immuneML.reports.train_gen_model_reports package

Submodules

immuneML.reports.train_gen_model_reports.KLKmerComparison module

class immuneML.reports.train_gen_model_reports.KLKmerComparison.KLKmerComparison(original_dataset: Dataset = None, generated_dataset: Dataset = None, result_path: Path = None, name: str = None, number_of_processes: int = 1, model: GenerativeModel = None, k: int = 3, n_sequences: int = 50, sequence_type: SequenceType = SequenceType.AMINO_ACID, region_type: RegionType = RegionType.IMGT_CDR3)

Bases: TrainGenModelReport

Estimates the KL divergence between the kmer distributions of the original and generated datasets, and makes a plot that shows which sequences (and which kmers) contribute the most to the divergence.

Specification arguments:

  • k (int): The kmer length to use for the KL divergence estimation. By default, k is set to 3.

  • n_sequences (int): The number of sequences to make the plot from (the sequences that contribute the most to the KL divergence). By default, n_sequences is set to 50.
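The roles of k and n_sequences can be illustrated with a minimal sketch. It is not immuneML's implementation: the function names, the additive pseudocount smoothing, the direction of the divergence (original relative to generated), and the scoring of a sequence as the sum of the terms of its kmers are all assumptions made for illustration. It uses the standard definition D_KL(P || Q) = sum over kmers of p * log(p / q).

```python
import math
from collections import Counter


def kmer_counts(sequences, k):
    """Count all overlapping kmers of length k across the sequences."""
    counts = Counter()
    for seq in sequences:
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts


def kmer_kl_terms(original, generated, k=3, pseudocount=1.0):
    """Return the per-kmer terms p * log(p / q) of the KL divergence.

    p and q are smoothed kmer frequencies in the original and generated
    sequences. The pseudocount (an assumption of this sketch) keeps q > 0.
    """
    p_counts = kmer_counts(original, k)
    q_counts = kmer_counts(generated, k)
    vocab = set(p_counts) | set(q_counts)
    p_total = sum(p_counts.values()) + pseudocount * len(vocab)
    q_total = sum(q_counts.values()) + pseudocount * len(vocab)
    terms = {}
    for kmer in vocab:
        p = (p_counts[kmer] + pseudocount) / p_total
        q = (q_counts[kmer] + pseudocount) / q_total
        terms[kmer] = p * math.log(p / q)
    return terms


def top_sequences(sequences, terms, k=3, n_sequences=50):
    """Rank sequences by the summed KL terms of the kmers they contain."""
    scored = []
    for seq in sequences:
        score = sum(terms.get(seq[i:i + k], 0.0)
                    for i in range(len(seq) - k + 1))
        scored.append((score, seq))
    scored.sort(reverse=True)
    return scored[:n_sequences]


# Toy CDR3-like amino acid sequences, made up for this example.
original = ["CASSLGTDTQYF", "CASSPGQGNYGYTF", "CASSLGGNQPQHF"]
generated = ["CASSLGTDTQYF", "CAAAAAAAAAF", "CASSQETQYF"]

terms = kmer_kl_terms(original, generated, k=3)
kl_estimate = sum(terms.values())
top = top_sequences(original, terms, k=3, n_sequences=2)
```

In this sketch, a larger k gives a sparser kmer distribution that depends more heavily on the smoothing, while n_sequences only limits how many of the highest-scoring sequences are kept for plotting.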

YAML specification:

my_kl_report:
  KLKmerComparison:
    k: 3
    n_sequences: 50
__init__(original_dataset: Dataset = None, generated_dataset: Dataset = None, result_path: Path = None, name: str = None, number_of_processes: int = 1, model: GenerativeModel = None, k: int = 3, n_sequences: int = 50, sequence_type: SequenceType = SequenceType.AMINO_ACID, region_type: RegionType = RegionType.IMGT_CDR3)

The arguments defined below are set at runtime by the instruction.

Parameters:
  • original_dataset (Dataset) – a dataset object (can be repertoire, receptor or sequence dataset, depending on the specific report) provided as input to the TrainGenModel instruction

  • generated_dataset (Dataset) – a dataset object as produced from the generative model after being trained on the original dataset

  • result_path (Path) – location where the results (plots, tables, etc.) will be stored

  • name (str) – user-defined name of the report used in the HTML overview automatically generated by the platform from the key used to define the report in the YAML

  • number_of_processes (int) – how many processes should be created at once to speed up the analysis. For personal machines, 4 or 8 is usually a good choice.

  • model (GenerativeModel) – trained generative model from the instruction

classmethod build_object(**kwargs)

Creates the object of the subclass of the Report class from the parameters so that it can be used in the analysis. Depending on the type of the report, the parameters provided here are supplied at parsing time, while the other necessary parameters (e.g., the subset of the data from which the report should be created) are supplied at runtime. For more details, see specific direct subclasses of this class, describing different types of reports.

Parameters:

**kwargs – keyword arguments that will be provided by users in the specification (if immuneML is used as a command line tool) or in the dictionary when calling the method from the code, and which should be used to create the report object

Returns:

the object of the appropriate report class
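The split between parsing-time and runtime parameters described above can be sketched with a stand-in class. ExampleReport, its validation rule, and the result path are hypothetical and not part of immuneML. The sketch only illustrates the pattern of a build_object-style classmethod that receives user-specified keyword arguments, with the remaining arguments set later.

```python
class ExampleReport:
    """Hypothetical stand-in for a report class; not part of immuneML."""

    def __init__(self, result_path=None, name=None, k=3, n_sequences=50):
        self.result_path = result_path
        self.name = name
        self.k = k
        self.n_sequences = n_sequences

    @classmethod
    def build_object(cls, **kwargs):
        # Check user-supplied parameters before the object is created
        # (the validation rule here is an assumption of this sketch).
        for key in ("k", "n_sequences"):
            if key in kwargs and (not isinstance(kwargs[key], int)
                                  or kwargs[key] < 1):
                raise ValueError(
                    f"{key} must be a positive integer, got {kwargs[key]!r}")
        return cls(**kwargs)


# Dictionary equivalent of the YAML specification, using the stand-in class.
spec = {"my_kl_report": {"ExampleReport": {"k": 3, "n_sequences": 50}}}
params = spec["my_kl_report"]["ExampleReport"]

# Parsing time: only user-specified parameters are known.
report = ExampleReport.build_object(name="my_kl_report", **params)

# Runtime: the remaining arguments are filled in later (hypothetical path).
report.result_path = "results/my_kl_report"
```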

static get_title()

immuneML.reports.train_gen_model_reports.TrainGenModelReport module

class immuneML.reports.train_gen_model_reports.TrainGenModelReport.TrainGenModelReport(original_dataset: Dataset = None, generated_dataset: Dataset = None, result_path: Path = None, name: str = None, number_of_processes: int = 1, model: GenerativeModel = None)

Bases: Report

TrainGenModel reports show features or statistics comparing two datasets: the original one and the generated one, potentially in combination with the trained model. These reports can only be used inside the TrainGenModel instruction, with the aim of comparing the dataset used to train a generative model and the dataset created from the trained model.

__init__(original_dataset: Dataset = None, generated_dataset: Dataset = None, result_path: Path = None, name: str = None, number_of_processes: int = 1, model: GenerativeModel = None)

The arguments defined below are set at runtime by the instruction. Concrete classes inheriting TrainGenModelReport may include additional parameters that will be set by the user in the form of input arguments (e.g., from the YAML file).

Parameters:
  • original_dataset (Dataset) – a dataset object (can be repertoire, receptor or sequence dataset, depending on the specific report) provided as input to the TrainGenModel instruction

  • generated_dataset (Dataset) – a dataset object as produced from the generative model after being trained on the original dataset

  • result_path (Path) – location where the results (plots, tables, etc.) will be stored

  • name (str) – user-defined name of the report used in the HTML overview automatically generated by the platform from the key used to define the report in the YAML

  • number_of_processes (int) – how many processes should be created at once to speed up the analysis. For personal machines, 4 or 8 is usually a good choice.

  • model (GenerativeModel) – trained generative model from the instruction

static get_title()

Module contents