immuneML.reports.train_ml_model_reports package
Submodules
immuneML.reports.train_ml_model_reports.CVFeaturePerformance module
-
class
immuneML.reports.train_ml_model_reports.CVFeaturePerformance.
CVFeaturePerformance
(feature: Optional[str] = None, state: Optional[immuneML.hyperparameter_optimization.states.TrainMLModelState.TrainMLModelState] = None, result_path: Optional[pathlib.Path] = None, label: Optional[str] = None, name: Optional[str] = None, is_feature_axis_categorical: Optional[bool] = None)[source]¶ Bases:
immuneML.reports.train_ml_model_reports.TrainMLModelReport.TrainMLModelReport
This report plots the average training vs. test performance with respect to a given encoding parameter, which is set explicitly in the feature attribute. It can be used only in combination with the TrainMLModel instruction and can be specified only under ‘reports’.
- Parameters
feature – name of the encoder parameter w.r.t. which the performance across training and test will be shown. Possible values depend on the encoder on which it is used.
is_feature_axis_categorical (bool) – if the x-axis of the plot where features are shown should be categorical; alternatively it is automatically determined based on the feature values
YAML specification:
report1:
    CVFeaturePerformance:
        feature: p_value_threshold # parameter value of SequenceAbundance encoder
        is_feature_axis_categorical: True # show x-axis as categorical
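The aggregation behind such a plot can be sketched in plain Python. This is a minimal illustration under stated assumptions, not immuneML's implementation: the record layout, the function name `average_performance_by_feature`, and all numbers are made up for the example.

```python
from collections import defaultdict
from statistics import mean


def average_performance_by_feature(records):
    """Average training and test performance per feature value.

    `records` is a hypothetical list of (feature_value, train_perf, test_perf)
    tuples, one tuple per cross-validation split.
    """
    grouped = defaultdict(lambda: {"train": [], "test": []})
    for feature_value, train_perf, test_perf in records:
        grouped[feature_value]["train"].append(train_perf)
        grouped[feature_value]["test"].append(test_perf)
    # One (mean train, mean test) pair per feature value, ordered by value.
    return {
        value: (mean(perf["train"]), mean(perf["test"]))
        for value, perf in sorted(grouped.items())
    }


# Made-up performances for two p_value_threshold settings over two splits.
records = [
    (0.01, 0.5, 0.25),
    (0.01, 1.0, 0.75),
    (0.05, 0.75, 0.5),
    (0.05, 0.25, 0.5),
]
averages = average_performance_by_feature(records)
```

Each key of `averages` would correspond to one position on the x-axis, with the two means plotted as the training and test values.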
-
classmethod
build_object
(**kwargs)[source]¶ Creates the object of the subclass of the Report class from the parameters so that it can be used in the analysis. Depending on the type of the report, the parameters provided here will be provided in parsing time, while the other necessary parameters (e.g., subset of the data from which the report should be created) will be provided at runtime. For more details, see specific direct subclasses of this class, describing different types of reports.
- Parameters
**kwargs – keyword arguments that will be provided by users in the specification (if immuneML is used as a command line tool) or in the dictionary when calling the method from the code, and which should be used to create the report object
- Returns
the object of the appropriate report class
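The split between parsing-time and runtime parameters described above can be illustrated with a small sketch. `ExampleReport` and its attributes are hypothetical and are not part of immuneML; the sketch only shows the general pattern of a `build_object` classmethod that forwards keyword arguments to the constructor.

```python
class ExampleReport:
    """Hypothetical report class used only for illustration."""

    def __init__(self, feature=None, name=None, result_path=None):
        self.feature = feature          # set by the user in the specification
        self.name = name                # set by the user in the specification
        self.result_path = result_path  # filled in later, at runtime

    @classmethod
    def build_object(cls, **kwargs):
        # Parameters known at parsing time arrive as keyword arguments.
        return cls(**kwargs)


report = ExampleReport.build_object(feature="p_value_threshold", name="report1")
report.result_path = "results/report1"  # hypothetical value supplied at runtime
```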
-
check_prerequisites
()[source]¶ Checks prerequisites for the generation of the report of specific class (e.g., if the class of the MLMethod instance is the one required by the report, if the data has been encoded to make a report of encoded dataset). In the instructions in immuneML, this function is used to determine whether to call generate_report() in the specific situation. Each report subclass has its own set of prerequisites. If the report cannot be run, the information on this will be logged and the report skipped in the specific situation. No error will be raised. See subclasses of the class
Instruction for more information on how the reports are executed.
- Returns
boolean value: True if the prerequisites are met, and False otherwise.
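The behaviour described for check_prerequisites (log the reason, return False, raise no error) can be sketched as follows. The class `PrerequisiteCheckedReport`, the helper `run_report`, and the specific check on `state` are assumptions for illustration, not immuneML code.

```python
import logging


class PrerequisiteCheckedReport:
    """Hypothetical report class used only for illustration."""

    def __init__(self, state=None):
        self.state = state

    def check_prerequisites(self):
        # Log the reason and return False instead of raising an error.
        if self.state is None:
            logging.warning("No state available, the report will be skipped.")
            return False
        return True

    def generate_report(self):
        return f"report for {self.state}"


def run_report(report):
    """Call generate_report() only when the prerequisites hold."""
    if report.check_prerequisites():
        return report.generate_report()
    return None


skipped = run_report(PrerequisiteCheckedReport())
generated = run_report(PrerequisiteCheckedReport(state="trained models"))
```

In this sketch the caller, playing the role of an instruction, never has to handle an exception: a report that cannot run is simply skipped.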
immuneML.reports.train_ml_model_reports.DiseaseAssociatedSequenceCVOverlap module
-
class
immuneML.reports.train_ml_model_reports.DiseaseAssociatedSequenceCVOverlap.
DiseaseAssociatedSequenceCVOverlap
(state: Optional[immuneML.hyperparameter_optimization.states.TrainMLModelState.TrainMLModelState] = None, result_path: Optional[pathlib.Path] = None, name: Optional[str] = None, compare_in_selection: bool = False, compare_in_assessment: bool = False)[source]¶ Bases:
immuneML.reports.train_ml_model_reports.TrainMLModelReport.TrainMLModelReport
DiseaseAssociatedSequenceCVOverlap report makes one heatmap per label showing the overlap of disease-associated sequences produced by the SequenceAbundance encoder between folds of cross-validation (either inner or outer loop of the nested CV). The overlap is computed by the following equation:
\[\mathrm{overlap}(X,Y) = \frac{|X \cap Y|}{\min(|X|, |Y|)} \times 100\]
For details, see Greiff V, Menzel U, Miho E, et al. Systems Analysis Reveals High Genetic and Antigen-Driven Predetermination of Antibody Repertoires throughout B Cell Development. Cell Reports. 2017;19(7):1467-1478. doi:10.1016/j.celrep.2017.04.054.
- Parameters
compare_in_selection (bool) – whether to compute the overlap over the inner loop of the nested CV - the sequence overlap is shown across CV folds for the model chosen as optimal within that selection
compare_in_assessment (bool) – whether to compute the overlap over the optimal models in the outer loop of the nested CV
YAML specification:
reports: # the report is defined with all other reports under definitions/reports
    my_overlap_report: DiseaseAssociatedSequenceCVOverlap # report has no parameters
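As a worked instance of the overlap equation above, the following sketch computes the percentage overlap between two sets of sequences. The function name, the example sequences, and the choice to raise an error for empty sets are assumptions for illustration; the report's own implementation may differ.

```python
def overlap(x, y):
    """Percentage overlap: |X ∩ Y| / min(|X|, |Y|) * 100."""
    x, y = set(x), set(y)
    if not x or not y:
        # Assumption: treat the overlap with an empty set as undefined.
        raise ValueError("overlap is undefined for an empty set")
    return len(x & y) / min(len(x), len(y)) * 100


# Made-up sequences standing in for the disease-associated sequences of two folds.
fold_1 = {"CASSLGTDTQYF", "CASSPGQGYEQYF", "CASSIRSSYEQYF"}
fold_2 = {"CASSLGTDTQYF", "CASSQDRGNTIYF"}
value = overlap(fold_1, fold_2)  # one shared sequence, smaller set has two
```

Because the denominator is the size of the smaller set, a fold whose sequences are all contained in another fold yields an overlap of 100 regardless of how large the other fold is.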
-
classmethod
build_object
(**kwargs)[source]¶ Creates the object of the subclass of the Report class from the parameters so that it can be used in the analysis. Depending on the type of the report, the parameters provided here will be provided in parsing time, while the other necessary parameters (e.g., subset of the data from which the report should be created) will be provided at runtime. For more details, see specific direct subclasses of this class, describing different types of reports.
- Parameters
**kwargs – keyword arguments that will be provided by users in the specification (if immuneML is used as a command line tool) or in the dictionary when calling the method from the code, and which should be used to create the report object
- Returns
the object of the appropriate report class
immuneML.reports.train_ml_model_reports.MLSettingsPerformance module
-
class
immuneML.reports.train_ml_model_reports.MLSettingsPerformance.
MLSettingsPerformance
(single_axis_labels, x_label_position, y_label_position, name: Optional[str] = None, state: Optional[immuneML.hyperparameter_optimization.states.TrainMLModelState.TrainMLModelState] = None, result_path: Optional[pathlib.Path] = None)[source]¶ Bases:
immuneML.reports.train_ml_model_reports.TrainMLModelReport.TrainMLModelReport
Report for TrainMLModel instruction: plots the performance for each of the setting combinations as defined under ‘settings’ in the assessment (outer validation) loop.
The performances are grouped by label (horizontal panels), encoding (vertical panels), and ML method (bar color). When multiple data splits are used, the average performance over the data splits is shown with an error bar representing the standard deviation.
This report can be used only with TrainMLModel instruction under ‘reports’.
- Parameters
single_axis_labels (bool) – whether to use single axis labels. Note that using single axis labels makes the figure unsuited for rescaling, as the label position is given in a fixed distance from the axis. By default, single_axis_labels is False, resulting in standard plotly axis labels.
x_label_position (float) – if single_axis_labels is True, this should be a number specifying the x axis label position relative to the x axis. The default value for x_label_position is -0.1.
y_label_position (float) – same as x_label_position, but for the y axis.
YAML specification:
my_hp_report: MLSettingsPerformance
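The averaging over data splits described above can be sketched as follows. This is an illustration under assumptions, not the report's implementation: the tuple layout, the `summarize` helper, and the example names and values are made up, and the sample standard deviation is used here although the source does not say which variant the report applies.

```python
from collections import defaultdict
from statistics import mean, stdev


def summarize(performances):
    """Mean and standard deviation per (label, encoding, ML method).

    `performances` is a hypothetical list of
    (label, encoding, ml_method, value) tuples, one per data split.
    """
    grouped = defaultdict(list)
    for label, encoding, ml_method, value in performances:
        grouped[(label, encoding, ml_method)].append(value)
    return {
        # With a single split there is no spread to show, so use 0.0.
        key: (mean(values), stdev(values) if len(values) > 1 else 0.0)
        for key, values in grouped.items()
    }


performances = [
    ("disease", "kmer", "LogisticRegression", 0.5),
    ("disease", "kmer", "LogisticRegression", 1.0),
    ("disease", "kmer", "SVM", 0.75),
]
summary = summarize(performances)
```

Each key of `summary` would correspond to one bar, with the mean as the bar height and the standard deviation as the error bar.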
-
classmethod
build_object
(**kwargs)[source]¶ Creates the object of the subclass of the Report class from the parameters so that it can be used in the analysis. Depending on the type of the report, the parameters provided here will be provided in parsing time, while the other necessary parameters (e.g., subset of the data from which the report should be created) will be provided at runtime. For more details, see specific direct subclasses of this class, describing different types of reports.
- Parameters
**kwargs – keyword arguments that will be provided by users in the specification (if immuneML is used as a command line tool) or in the dictionary when calling the method from the code, and which should be used to create the report object
- Returns
the object of the appropriate report class
-
check_prerequisites
()[source]¶ Checks prerequisites for the generation of the report of specific class (e.g., if the class of the MLMethod instance is the one required by the report, if the data has been encoded to make a report of encoded dataset). In the instructions in immuneML, this function is used to determine whether to call generate_report() in the specific situation. Each report subclass has its own set of prerequisites. If the report cannot be run, the information on this will be logged and the report skipped in the specific situation. No error will be raised. See subclasses of the class
Instruction for more information on how the reports are executed.
- Returns
boolean value: True if the prerequisites are met, and False otherwise.
immuneML.reports.train_ml_model_reports.MLSubseqPerformance module
-
class
immuneML.reports.train_ml_model_reports.MLSubseqPerformance.
MLSubseqPerformance
(name: Optional[str] = None)[source]¶ Bases:
immuneML.reports.train_ml_model_reports.MLSettingsPerformance.MLSettingsPerformance
Report for TrainMLModel: Similar to MLSettingsPerformance, this report plots the performance of certain combinations of encodings and ML methods.
As in MLSettingsPerformance, the performances are grouped by label (horizontal panels). However, the bar color is determined by the ML method class (thus several ML methods with different parameters may be grouped together) and the vertical panel grouping is determined by the subsequence size used for motif recovery. This subsequence size is either the k-mer size or the kernel size (DeepRC).
This report can only be used to plot the results for setting combinations using k-mer encoding with continuous k-mers (in combination with any ML method), or DeepRC encoding + ML method.
This report can only be used with TrainMLModel instruction under ‘reports’.
YAML specification:
my_hp_report: MLSubseqPerformance
-
classmethod
build_object
(**kwargs)[source]¶ Creates the object of the subclass of the Report class from the parameters so that it can be used in the analysis. Depending on the type of the report, the parameters provided here will be provided in parsing time, while the other necessary parameters (e.g., subset of the data from which the report should be created) will be provided at runtime. For more details, see specific direct subclasses of this class, describing different types of reports.
- Parameters
**kwargs – keyword arguments that will be provided by users in the specification (if immuneML is used as a command line tool) or in the dictionary when calling the method from the code, and which should be used to create the report object
- Returns
the object of the appropriate report class
-
check_prerequisites
()[source]¶ Checks prerequisites for the generation of the report of specific class (e.g., if the class of the MLMethod instance is the one required by the report, if the data has been encoded to make a report of encoded dataset). In the instructions in immuneML, this function is used to determine whether to call generate_report() in the specific situation. Each report subclass has its own set of prerequisites. If the report cannot be run, the information on this will be logged and the report skipped in the specific situation. No error will be raised. See subclasses of the class
Instruction for more information on how the reports are executed.
- Returns
boolean value: True if the prerequisites are met, and False otherwise.
immuneML.reports.train_ml_model_reports.ROCCurveSummary module
-
class
immuneML.reports.train_ml_model_reports.ROCCurveSummary.
ROCCurveSummary
(name: Optional[str] = None, state: Optional[immuneML.hyperparameter_optimization.states.TrainMLModelState.TrainMLModelState] = None, result_path: Optional[pathlib.Path] = None)[source]¶ Bases:
immuneML.reports.train_ml_model_reports.TrainMLModelReport.TrainMLModelReport
This report plots ROC curves for all trained ML settings ([preprocessing], encoding, ML model) in the outer loop of cross-validation in TrainMLModel instruction. If there are multiple splits in the outer loop, this report will make one plot per split. This report is defined only for binary classification. If there are multiple labels defined in the instruction, each label has to have two classes to be included in this report.
Arguments: there are no arguments for this report.
YAML specification:
- reports:
my_roc_summary_report: ROCCurveSummary
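For readers unfamiliar with how the points of a ROC curve arise, the following plain-Python sketch computes them for one binary label. It is an illustration only and not the code used by the report; the function name `roc_points` and the example data are assumptions.

```python
def roc_points(y_true, scores):
    """Return (false positive rate, true positive rate) pairs.

    `y_true` holds 0/1 class labels and `scores` the predicted scores
    for the positive class. Both classes must be present.
    """
    positives = sum(y_true)
    negatives = len(y_true) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("ROC curves need examples of both classes")

    points = [(0.0, 0.0)]
    # Lower the decision threshold step by step over the distinct scores.
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


# Made-up labels and scores for four examples.
points = roc_points([1, 1, 0, 0], [0.9, 0.8, 0.7, 0.1])
```

The check for both classes mirrors the restriction stated above: a label without exactly two classes has no ROC curve in this sense.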
-
classmethod
build_object
(**kwargs)[source]¶ Creates the object of the subclass of the Report class from the parameters so that it can be used in the analysis. Depending on the type of the report, the parameters provided here will be provided in parsing time, while the other necessary parameters (e.g., subset of the data from which the report should be created) will be provided at runtime. For more details, see specific direct subclasses of this class, describing different types of reports.
- Parameters
**kwargs – keyword arguments that will be provided by users in the specification (if immuneML is used as a command line tool) or in the dictionary when calling the method from the code, and which should be used to create the report object
- Returns
the object of the appropriate report class
immuneML.reports.train_ml_model_reports.ReferenceSequenceOverlap module
-
class
immuneML.reports.train_ml_model_reports.ReferenceSequenceOverlap.
ReferenceSequenceOverlap
(reference_path: Optional[pathlib.Path] = None, comparison_attributes: Optional[list] = None, name: Optional[str] = None, state: Optional[immuneML.hyperparameter_optimization.states.TrainMLModelState.TrainMLModelState] = None, result_path: Optional[pathlib.Path] = None, label: Optional[str] = None)[source]¶ Bases:
immuneML.reports.train_ml_model_reports.TrainMLModelReport.TrainMLModelReport
The ReferenceSequenceOverlap report compares a list of disease-associated sequences produced by the SequenceAbundance encoder to a list of reference receptor sequences. It outputs a Venn diagram and a list of receptor sequences found both in the encoder and reference.
The report compares the sequences by their sequence content and the additional comparison_attributes (such as V or J gene), as specified by the user.
- Parameters
reference_path (str) – path to the reference file in csv format which contains one entry per row and has columns that correspond to the attributes listed under comparison_attributes argument
comparison_attributes (list) – list of attributes to use for comparison; all of them have to be present in the reference file where they should be the names of the columns
label (str) – name of the label for which the reference sequences should be compared to the model; if none, it takes the one label from the instruction; if it is none and multiple labels were specified for the instruction, the report will not be generated
YAML specification:
reports: # the report is defined with all other reports under definitions/reports
    my_reference_overlap_report:
        ReferenceSequenceOverlap:
            reference_path: reference.csv # a reference file with columns listed under comparison_attributes
            comparison_attributes:
                - sequence_aas
                - v_genes
                - j_genes
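The comparison by sequence content plus additional attributes can be sketched with the standard library. This is a minimal illustration, not the report's implementation: the helper `read_keys`, the in-memory CSV text, and the sequences and gene names are made up for the example.

```python
import csv
import io


def read_keys(csv_text, comparison_attributes):
    """Build a set of tuples, one per row, from the chosen columns."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {tuple(row[attr] for attr in comparison_attributes) for row in reader}


comparison_attributes = ["sequence_aas", "v_genes", "j_genes"]

# Made-up reference file content and made-up model sequences.
reference_csv = """sequence_aas,v_genes,j_genes
CASSLGTDTQYF,TRBV7-9,TRBJ2-3
CASSPGQGYEQYF,TRBV5-1,TRBJ2-7
"""
model_csv = """sequence_aas,v_genes,j_genes
CASSLGTDTQYF,TRBV7-9,TRBJ2-3
CASSPGQGYEQYF,TRBV6-5,TRBJ2-7
"""

reference = read_keys(reference_csv, comparison_attributes)
model = read_keys(model_csv, comparison_attributes)

shared = reference & model          # the intersection of a Venn diagram
only_reference = reference - model  # found only in the reference
only_model = model - reference      # found only among the model sequences
```

The second sequence appears in both inputs but with a different V gene, so it does not count as shared once `v_genes` is part of the comparison.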
-
classmethod
build_object
(**kwargs)[source]¶ Creates the object of the subclass of the Report class from the parameters so that it can be used in the analysis. Depending on the type of the report, the parameters provided here will be provided in parsing time, while the other necessary parameters (e.g., subset of the data from which the report should be created) will be provided at runtime. For more details, see specific direct subclasses of this class, describing different types of reports.
- Parameters
**kwargs – keyword arguments that will be provided by users in the specification (if immuneML is used as a command line tool) or in the dictionary when calling the method from the code, and which should be used to create the report object
- Returns
the object of the appropriate report class
-
check_prerequisites
()[source]¶ Checks prerequisites for the generation of the report of specific class (e.g., if the class of the MLMethod instance is the one required by the report, if the data has been encoded to make a report of encoded dataset). In the instructions in immuneML, this function is used to determine whether to call generate_report() in the specific situation. Each report subclass has its own set of prerequisites. If the report cannot be run, the information on this will be logged and the report skipped in the specific situation. No error will be raised. See subclasses of the class
Instruction for more information on how the reports are executed.
- Returns
boolean value: True if the prerequisites are met, and False otherwise.
immuneML.reports.train_ml_model_reports.TrainMLModelReport module
-
class
immuneML.reports.train_ml_model_reports.TrainMLModelReport.
TrainMLModelReport
(name: Optional[str] = None, state: Optional[immuneML.hyperparameter_optimization.states.TrainMLModelState.TrainMLModelState] = None, result_path: Optional[pathlib.Path] = None)[source]¶ Bases:
immuneML.reports.Report.Report
Train ML model reports plot general statistics or export data of multiple models simultaneously when running the TrainMLModel instruction.
In the TrainMLModel instruction, train ML model reports can be specified under ‘reports’.
When using the reports with TrainMLModel instruction, the arguments defined below are set at runtime by the instruction. Concrete classes inheriting TrainMLModelReport may include additional parameters that will be set by the user in the form of input arguments.
- Parameters
name (str) – user-defined name of the report used in the HTML overview automatically generated by the platform
state (TrainMLModelState) – a state object that includes all the information, trained models, encodings and datasets from the nested cross-validation procedure used to train the optimal model.
result_path (Path) – location where the report results will be stored