immuneML.reports.train_ml_model_reports package

Submodules

immuneML.reports.train_ml_model_reports.CVFeaturePerformance module

class immuneML.reports.train_ml_model_reports.CVFeaturePerformance.CVFeaturePerformance(feature: Optional[str] = None, state: Optional[immuneML.hyperparameter_optimization.states.TrainMLModelState.TrainMLModelState] = None, result_path: Optional[pathlib.Path] = None, label: Optional[str] = None, name: Optional[str] = None, is_feature_axis_categorical: Optional[bool] = None)[source]

Bases: immuneML.reports.train_ml_model_reports.TrainMLModelReport.TrainMLModelReport

This report plots the average training vs. test performance with respect to a given encoding parameter, which is explicitly set in the feature attribute. It can be used only in combination with the TrainMLModel instruction and can be specified only under ‘reports’.

Parameters
  • feature – name of the encoder parameter w.r.t. which the performance across training and test will be shown. Possible values depend on the encoder on which it is used.

  • is_feature_axis_categorical (bool) – if the x-axis of the plot where features are shown should be categorical; alternatively it is automatically determined based on the feature values.

YAML specification:

report1:
    CVFeaturePerformance:
        feature: p_value_threshold # parameter value of SequenceAbundance encoder
        is_feature_axis_categorical: True # show x-axis as categorical
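
The averaging behind such a plot can be sketched as follows. This is only an illustration under stated assumptions, not immuneML's implementation: the `results` layout, the function names and the rule in `is_axis_categorical` are all hypothetical, since the documentation only says that the axis type is otherwise determined from the feature values.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical per-split results: (feature value, training score, test score).
# The real values would come from the TrainMLModel state; this layout is assumed.
results = [
    (0.001, 0.95, 0.80),
    (0.001, 0.93, 0.78),
    (0.01, 0.90, 0.82),
    (0.01, 0.88, 0.84),
]


def average_performance(rows):
    """Average the training and test scores for each feature value."""
    grouped = defaultdict(lambda: {"train": [], "test": []})
    for feature_value, train_score, test_score in rows:
        grouped[feature_value]["train"].append(train_score)
        grouped[feature_value]["test"].append(test_score)
    return {
        value: (mean(scores["train"]), mean(scores["test"]))
        for value, scores in sorted(grouped.items())
    }


def is_axis_categorical(feature_values):
    """Assumed fallback rule: categorical unless every value is a number."""
    return not all(
        isinstance(v, (int, float)) and not isinstance(v, bool)
        for v in feature_values
    )


averages = average_performance(results)
for value, (train_avg, test_avg) in averages.items():
    print(f"{value}: train={train_avg:.3f} test={test_avg:.3f}")
```

Each feature value ends up with one averaged training score and one averaged test score, which is the pair of points the report would draw per x-axis position.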
classmethod build_object(**kwargs)[source]

Creates an object of the Report subclass from the given parameters so that it can be used in the analysis. Depending on the type of the report, the parameters given here are provided at parsing time, while the other necessary parameters (e.g., the subset of the data from which the report should be created) are provided at runtime. For more details, see the specific direct subclasses of this class, which describe the different types of reports.

Parameters

**kwargs – keyword arguments that will be provided by users in the specification (if immuneML is used as a command line tool) or in the dictionary when calling the method from the code, and which should be used to create the report object

Returns

the object of the appropriate report class
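
The split between parsing-time and runtime parameters can be illustrated with a small sketch. `ExampleReport` is a hypothetical stand-in, not an immuneML class, and the way the runtime value is attached afterwards is an assumption made for the example.

```python
from pathlib import Path
from typing import Optional


class ExampleReport:
    """Hypothetical report class used only for illustration."""

    def __init__(self, feature: Optional[str] = None, name: Optional[str] = None,
                 result_path: Optional[Path] = None):
        self.feature = feature
        self.name = name
        self.result_path = result_path  # not known yet at parsing time

    @classmethod
    def build_object(cls, **kwargs):
        # Parameters from the specification are passed straight to the constructor.
        return cls(**kwargs)


# Parsing time: only the user-specified parameters are available.
user_params = {"feature": "p_value_threshold", "name": "report1"}
report = ExampleReport.build_object(**user_params)

# Runtime: the remaining attributes are filled in before the report is generated.
report.result_path = Path("results") / "report1"
```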

check_prerequisites()[source]

Checks prerequisites for the generation of the report of specific class (e.g., if the class of the MLMethod instance is the one required by the report, if the data has been encoded to make a report of encoded dataset). In the instructions in immuneML, this function is used to determine whether to call generate_report() in the specific situation. Each report subclass has its own set of prerequisites. If the report cannot be run, the information on this will be logged and the report skipped in the specific situation. No error will be raised. See subclasses of the class Instruction for more information on how the reports are executed.

Returns

boolean value True if the prerequisites are o.k., and False otherwise.
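
A minimal sketch of the control flow described above, assuming a hypothetical `ExampleReport` class and a hypothetical `run_reports` helper (neither is part of immuneML): a report whose prerequisites fail is logged and skipped rather than raising an error.

```python
import logging


class ExampleReport:
    """Hypothetical report used only to illustrate the skip-on-failure flow."""

    def __init__(self, name, state=None):
        self.name = name
        self.state = state

    def check_prerequisites(self) -> bool:
        # Assumed prerequisite for this example: a state object must be set.
        if self.state is None:
            logging.warning("Report %s skipped: no state was provided.", self.name)
            return False
        return True

    def generate_report(self):
        return f"{self.name}: generated"


def run_reports(reports):
    """Generate only the reports whose prerequisites are met."""
    results = []
    for report in reports:
        if report.check_prerequisites():
            results.append(report.generate_report())
    return results


outputs = run_reports([ExampleReport("ready", state={}), ExampleReport("not_ready")])
```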

immuneML.reports.train_ml_model_reports.DiseaseAssociatedSequenceCVOverlap module

class immuneML.reports.train_ml_model_reports.DiseaseAssociatedSequenceCVOverlap.DiseaseAssociatedSequenceCVOverlap(state: Optional[immuneML.hyperparameter_optimization.states.TrainMLModelState.TrainMLModelState] = None, result_path: Optional[pathlib.Path] = None, name: Optional[str] = None, compare_in_selection: bool = False, compare_in_assessment: bool = False)[source]

Bases: immuneML.reports.train_ml_model_reports.TrainMLModelReport.TrainMLModelReport

DiseaseAssociatedSequenceCVOverlap report makes one heatmap per label showing the overlap of disease-associated sequences produced by the SequenceAbundance encoder between folds of cross-validation (either inner or outer loop of the nested CV). The overlap is computed by the following equation:

\[overlap(X,Y) = \frac{|X \cap Y|}{\min(|X|, |Y|)} \times 100\]

For details, see Greiff V, Menzel U, Miho E, et al. Systems Analysis Reveals High Genetic and Antigen-Driven Predetermination of Antibody Repertoires throughout B Cell Development. Cell Reports. 2017;19(7):1467-1478. doi:10.1016/j.celrep.2017.04.054.
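
The overlap equation translates directly into a few lines of Python. The sketch below is illustrative only: the fold contents are made-up strings, and returning 0 for an empty set is an assumed convention that the documentation does not specify.

```python
from itertools import combinations


def overlap(x, y):
    """Overlap in percent: |X intersect Y| / min(|X|, |Y|) * 100."""
    x, y = set(x), set(y)
    smaller = min(len(x), len(y))
    if smaller == 0:
        return 0.0  # assumed convention for empty sets
    return len(x & y) / smaller * 100


# Made-up sequence sets, one per cross-validation fold.
folds = {
    "fold_1": {"AAA", "BBB", "CCC"},
    "fold_2": {"BBB", "CCC", "DDD", "EEE"},
    "fold_3": {"AAA", "BBB"},
}

# One value per pair of folds, i.e. the cells of a heatmap.
pairwise = {
    (a, b): overlap(folds[a], folds[b])
    for a, b in combinations(sorted(folds), 2)
}
```

Because the denominator is the size of the smaller set, a fold whose sequences are fully contained in another fold gives an overlap of 100 regardless of how large the other fold is.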

Parameters
  • compare_in_selection (bool) – whether to compute the overlap over the inner loop of the nested CV; the sequence overlap is shown across CV folds for the model chosen as optimal within that selection

  • compare_in_assessment (bool) – whether to compute the overlap over the optimal models in the outer loop of the nested CV

YAML specification:

reports: # the report is defined with all other reports under definitions/reports
    my_overlap_report: DiseaseAssociatedSequenceCVOverlap # report has no parameters
classmethod build_object(**kwargs)[source]

Creates an object of the Report subclass from the given parameters so that it can be used in the analysis. Depending on the type of the report, the parameters given here are provided at parsing time, while the other necessary parameters (e.g., the subset of the data from which the report should be created) are provided at runtime. For more details, see the specific direct subclasses of this class, which describe the different types of reports.

Parameters

**kwargs – keyword arguments that will be provided by users in the specification (if immuneML is used as a command line tool) or in the dictionary when calling the method from the code, and which should be used to create the report object

Returns

the object of the appropriate report class

immuneML.reports.train_ml_model_reports.MLSettingsPerformance module

class immuneML.reports.train_ml_model_reports.MLSettingsPerformance.MLSettingsPerformance(single_axis_labels, x_label_position, y_label_position, name: Optional[str] = None, state: Optional[immuneML.hyperparameter_optimization.states.TrainMLModelState.TrainMLModelState] = None, result_path: Optional[pathlib.Path] = None)[source]

Bases: immuneML.reports.train_ml_model_reports.TrainMLModelReport.TrainMLModelReport

Report for TrainMLModel instruction: plots the performance for each of the setting combinations as defined under ‘settings’ in the assessment (outer validation) loop.

The performances are grouped by label (horizontal panels), encoding (vertical panels), and ML method (bar color). When multiple data splits are used, the average performance over the data splits is shown with an error bar representing the standard deviation.
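
The per-bar values can be sketched as a group-by followed by a mean and a standard deviation. The scores, the tuple layout and the `summarize` function are hypothetical, and the sample standard deviation used here is an assumption; the documentation does not state which variant the report uses.

```python
from collections import defaultdict
from statistics import mean, stdev

# Hypothetical scores, one row per data split: (label, encoding, ML method, score).
scores = [
    ("disease", "kmer", "LogisticRegression", 0.80),
    ("disease", "kmer", "LogisticRegression", 0.90),
    ("disease", "kmer", "RandomForest", 0.70),
    ("disease", "kmer", "RandomForest", 0.74),
]


def summarize(rows):
    """Return {(label, encoding, method): (mean, standard deviation)}."""
    grouped = defaultdict(list)
    for label, encoding, method, score in rows:
        grouped[(label, encoding, method)].append(score)
    return {
        key: (mean(values), stdev(values) if len(values) > 1 else 0.0)
        for key, values in grouped.items()
    }


summary = summarize(scores)
```

Each key corresponds to one bar: the mean is the bar height and the standard deviation is the length of the error bar.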

This report can be used only with TrainMLModel instruction under ‘reports’.

Parameters
  • single_axis_labels (bool) – whether to use single axis labels. Note that using single axis labels makes the figure unsuited for rescaling, as the label position is given at a fixed distance from the axis. By default, single_axis_labels is False, resulting in standard plotly axis labels.

  • x_label_position (float) – if single_axis_labels is True, this should be a number specifying the x axis label position relative to the x axis. The default value for x_label_position is -0.1.

  • y_label_position (float) – same as x_label_position, but for the y axis.

YAML specification:

my_hp_report: MLSettingsPerformance
classmethod build_object(**kwargs)[source]

Creates an object of the Report subclass from the given parameters so that it can be used in the analysis. Depending on the type of the report, the parameters given here are provided at parsing time, while the other necessary parameters (e.g., the subset of the data from which the report should be created) are provided at runtime. For more details, see the specific direct subclasses of this class, which describe the different types of reports.

Parameters

**kwargs – keyword arguments that will be provided by users in the specification (if immuneML is used as a command line tool) or in the dictionary when calling the method from the code, and which should be used to create the report object

Returns

the object of the appropriate report class

check_prerequisites()[source]

Checks prerequisites for the generation of the report of specific class (e.g., if the class of the MLMethod instance is the one required by the report, if the data has been encoded to make a report of encoded dataset). In the instructions in immuneML, this function is used to determine whether to call generate_report() in the specific situation. Each report subclass has its own set of prerequisites. If the report cannot be run, the information on this will be logged and the report skipped in the specific situation. No error will be raised. See subclasses of the class Instruction for more information on how the reports are executed.

Returns

boolean value True if the prerequisites are o.k., and False otherwise.

std(x)[source]

immuneML.reports.train_ml_model_reports.MLSubseqPerformance module

class immuneML.reports.train_ml_model_reports.MLSubseqPerformance.MLSubseqPerformance(name: Optional[str] = None)[source]

Bases: immuneML.reports.train_ml_model_reports.MLSettingsPerformance.MLSettingsPerformance

Report for TrainMLModel: Similar to MLSettingsPerformance, this report plots the performance of certain combinations of encodings and ML methods.

Similarly to MLSettingsPerformance, the performances are grouped by label (horizontal panels). However, the bar color is determined by the ML method class (thus several ML methods with different parameters may be grouped together) and the vertical panel grouping is determined by the subsequence size used for motif recovery. This subsequence size is either the k-mer size or the kernel size (DeepRC).

This report can only be used to plot the results for setting combinations using k-mer encoding with continuous k-mers (in combination with any ML method), or DeepRC encoding with the DeepRC ML method.

This report can only be used with TrainMLModel instruction under ‘reports’.

YAML specification:

my_hp_report: MLSubseqPerformance
classmethod build_object(**kwargs)[source]

Creates an object of the Report subclass from the given parameters so that it can be used in the analysis. Depending on the type of the report, the parameters given here are provided at parsing time, while the other necessary parameters (e.g., the subset of the data from which the report should be created) are provided at runtime. For more details, see the specific direct subclasses of this class, which describe the different types of reports.

Parameters

**kwargs – keyword arguments that will be provided by users in the specification (if immuneML is used as a command line tool) or in the dictionary when calling the method from the code, and which should be used to create the report object

Returns

the object of the appropriate report class

check_prerequisites()[source]

Checks prerequisites for the generation of the report of specific class (e.g., if the class of the MLMethod instance is the one required by the report, if the data has been encoded to make a report of encoded dataset). In the instructions in immuneML, this function is used to determine whether to call generate_report() in the specific situation. Each report subclass has its own set of prerequisites. If the report cannot be run, the information on this will be logged and the report skipped in the specific situation. No error will be raised. See subclasses of the class Instruction for more information on how the reports are executed.

Returns

boolean value True if the prerequisites are o.k., and False otherwise.

immuneML.reports.train_ml_model_reports.ROCCurveSummary module

class immuneML.reports.train_ml_model_reports.ROCCurveSummary.ROCCurveSummary(name: Optional[str] = None, state: Optional[immuneML.hyperparameter_optimization.states.TrainMLModelState.TrainMLModelState] = None, result_path: Optional[pathlib.Path] = None)[source]

Bases: immuneML.reports.train_ml_model_reports.TrainMLModelReport.TrainMLModelReport

This report plots ROC curves for all trained ML settings ([preprocessing], encoding, ML model) in the outer loop of cross-validation in the TrainMLModel instruction. If there are multiple splits in the outer loop, this report will make one plot per split. This report is defined only for binary classification. If there are multiple labels defined in the instruction, each label has to have exactly two classes to be included in this report.
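
For readers unfamiliar with how the points of a ROC curve are obtained for a binary label, a minimal pure-Python sketch follows. It is not the code the report uses; the function name and the example labels and scores are made up, and labels are assumed to be encoded as 0 and 1.

```python
def roc_points(y_true, y_score):
    """Return (false positive rate, true positive rate) points of a ROC curve.

    y_true holds 0/1 class labels and y_score the predicted scores for class 1.
    """
    positives = sum(y_true)
    negatives = len(y_true) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("ROC is defined only when both classes are present")

    # Sweep the decision threshold from the highest score downwards.
    ranked = sorted(zip(y_score, y_true), key=lambda pair: pair[0], reverse=True)
    points = [(0.0, 0.0)]
    tp = fp = 0
    for i, (score, label) in enumerate(ranked):
        if label == 1:
            tp += 1
        else:
            fp += 1
        # Emit a point only after the last example with this score,
        # so that tied scores share a single threshold.
        if i == len(ranked) - 1 or ranked[i + 1][0] != score:
            points.append((fp / negatives, tp / positives))
    return points


curve = roc_points([1, 0, 1, 0], [0.9, 0.8, 0.7, 0.1])
```

The check at the top mirrors the restriction to binary classification: with only one class present, one of the two rates is undefined.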

Arguments: there are no arguments for this report.

YAML specification:


reports:
    my_roc_summary_report: ROCCurveSummary

classmethod build_object(**kwargs)[source]

Creates an object of the Report subclass from the given parameters so that it can be used in the analysis. Depending on the type of the report, the parameters given here are provided at parsing time, while the other necessary parameters (e.g., the subset of the data from which the report should be created) are provided at runtime. For more details, see the specific direct subclasses of this class, which describe the different types of reports.

Parameters

**kwargs – keyword arguments that will be provided by users in the specification (if immuneML is used as a command line tool) or in the dictionary when calling the method from the code, and which should be used to create the report object

Returns

the object of the appropriate report class

immuneML.reports.train_ml_model_reports.ReferenceSequenceOverlap module

class immuneML.reports.train_ml_model_reports.ReferenceSequenceOverlap.ReferenceSequenceOverlap(reference_path: Optional[pathlib.Path] = None, comparison_attributes: Optional[list] = None, name: Optional[str] = None, state: Optional[immuneML.hyperparameter_optimization.states.TrainMLModelState.TrainMLModelState] = None, result_path: Optional[pathlib.Path] = None, label: Optional[str] = None)[source]

Bases: immuneML.reports.train_ml_model_reports.TrainMLModelReport.TrainMLModelReport

The ReferenceSequenceOverlap report compares a list of disease-associated sequences produced by the SequenceAbundance encoder to a list of reference receptor sequences. It outputs a Venn diagram and a list of receptor sequences found both in the encoder and reference.

The report compares the sequences by their sequence content and the additional comparison_attributes (such as V or J gene), as specified by the user.
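
The comparison can be pictured as a set intersection over tuples built from the comparison attributes. The sketch below is an illustration under assumptions, not the report's implementation: the CSV content and the model sequences are made-up example data, and `to_keys` is a hypothetical helper.

```python
import csv
import io

# Made-up reference file content with the columns named in comparison_attributes.
reference_csv = io.StringIO(
    "sequence_aas,v_genes,j_genes\n"
    "CASSLGTDTQYF,TRBV7-9,TRBJ2-3\n"
    "CASSPGQGAYEQYF,TRBV5-1,TRBJ2-7\n"
)
comparison_attributes = ["sequence_aas", "v_genes", "j_genes"]

# Made-up sequences standing in for those selected by the encoder.
model_rows = [
    {"sequence_aas": "CASSLGTDTQYF", "v_genes": "TRBV7-9", "j_genes": "TRBJ2-3"},
    {"sequence_aas": "CASSPGQGAYEQYF", "v_genes": "TRBV6-1", "j_genes": "TRBJ2-7"},
]


def to_keys(rows, attributes):
    """Represent each row by the tuple of its comparison attribute values."""
    return {tuple(row[attribute] for attribute in attributes) for row in rows}


reference = to_keys(csv.DictReader(reference_csv), comparison_attributes)
model = to_keys(model_rows, comparison_attributes)

shared = reference & model          # centre of the Venn diagram
only_model = model - reference      # found only by the encoder
only_reference = reference - model  # found only in the reference
```

In this example the second sequence has the same amino acid sequence in both sources but a different V gene, so it is not counted as shared; dropping `v_genes` from `comparison_attributes` would make it match.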

Parameters
  • reference_path (str) – path to the reference file in csv format which contains one entry per row and has columns that correspond to the attributes listed under the comparison_attributes argument

  • comparison_attributes (list) – list of attributes to use for comparison; all of them have to be present in the reference file where they should be the names of the columns

  • label (str) – name of the label for which the reference sequences should be compared to the model; if none, it takes the one label from the instruction; if it is none and multiple labels were specified for the instruction, the report will not be generated

YAML specification:

reports: # the report is defined with all other reports under definitions/reports
    my_reference_overlap_report:
        ReferenceSequenceOverlap:
            reference_path: reference.csv # a reference file with columns listed under comparison_attributes
            comparison_attributes:
                - sequence_aas
                - v_genes
                - j_genes
classmethod build_object(**kwargs)[source]

Creates an object of the Report subclass from the given parameters so that it can be used in the analysis. Depending on the type of the report, the parameters given here are provided at parsing time, while the other necessary parameters (e.g., the subset of the data from which the report should be created) are provided at runtime. For more details, see the specific direct subclasses of this class, which describe the different types of reports.

Parameters

**kwargs – keyword arguments that will be provided by users in the specification (if immuneML is used as a command line tool) or in the dictionary when calling the method from the code, and which should be used to create the report object

Returns

the object of the appropriate report class

check_prerequisites()[source]

Checks prerequisites for the generation of the report of specific class (e.g., if the class of the MLMethod instance is the one required by the report, if the data has been encoded to make a report of encoded dataset). In the instructions in immuneML, this function is used to determine whether to call generate_report() in the specific situation. Each report subclass has its own set of prerequisites. If the report cannot be run, the information on this will be logged and the report skipped in the specific situation. No error will be raised. See subclasses of the class Instruction for more information on how the reports are executed.

Returns

boolean value True if the prerequisites are o.k., and False otherwise.

immuneML.reports.train_ml_model_reports.TrainMLModelReport module

class immuneML.reports.train_ml_model_reports.TrainMLModelReport.TrainMLModelReport(name: Optional[str] = None, state: Optional[immuneML.hyperparameter_optimization.states.TrainMLModelState.TrainMLModelState] = None, result_path: Optional[pathlib.Path] = None)[source]

Bases: immuneML.reports.Report.Report

Train ML model reports plot general statistics or export data of multiple models simultaneously when running the TrainMLModel instruction.

In the TrainMLModel instruction, train ML model reports can be specified under ‘reports’.

When using the reports with the TrainMLModel instruction, the arguments defined below are set at runtime by the instruction. Concrete classes inheriting from TrainMLModelReport may include additional parameters that are set by the user in the form of input arguments.

Parameters
  • name (str) – user-defined name of the report used in the HTML overview automatically generated by the platform

  • state (TrainMLModelState) – a state object that includes all the information, trained models, encodings and datasets from the nested cross-validation procedure used to train the optimal model.

  • result_path (Path) – location where the report results will be stored

static get_title()[source]

Module contents