immuneML.workflows.instructions.ml_model_application package

Submodules

immuneML.workflows.instructions.ml_model_application.MLApplicationInstruction module

class immuneML.workflows.instructions.ml_model_application.MLApplicationInstruction.MLApplicationInstruction(dataset: Dataset, label_configuration: LabelConfiguration, hp_setting: HPSetting, metrics: List[ClassificationMetric], number_of_processes: int, name: str)

Bases: Instruction

Instruction that applies trained ML models and encoders to new datasets, which do not need to be labeled. If the dataset contains the same label that the ML setting was trained for, performance metrics can also be computed.

The predictions are stored in predictions.csv in the result path, in the following format:

example_id	cmv_predicted_class	cmv_1_proba	cmv_0_proba
e1	1	0.8	0.2
e2	0	0.2	0.8
e3	1	0.78	0.22

If the label that the ML setting was trained for is present in the provided dataset, the ‘true’ label value is added to the predictions table as an extra column:

example_id	cmv_predicted_class	cmv_1_proba	cmv_0_proba	cmv_true_class
e1	1	0.8	0.2	1
e2	0	0.2	0.8	0
e3	1	0.78	0.22	0
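
As an illustration of how the predictions table could be consumed downstream, the sketch below reads a table with the columns shown above and computes accuracy when the true class column is present. It is not part of immuneML: the helper names `read_predictions` and `accuracy` are hypothetical, the comma delimiter is an assumption based on the `.csv` file name, and the column naming pattern (`<label>_predicted_class`, `<label>_true_class`) is inferred from the example tables.

```python
import csv
import io


def read_predictions(text, delimiter=","):
    """Parse the predictions table into a list of dicts (one per example).

    The delimiter is an assumption; adjust it if the file is tab-separated.
    """
    return list(csv.DictReader(io.StringIO(text), delimiter=delimiter))


def accuracy(rows, label):
    """Return the fraction of correct predictions for `label`.

    Returns None when the table has no rows or no `<label>_true_class`
    column, i.e. when the dataset did not carry the trained label.
    """
    pred_col = f"{label}_predicted_class"
    true_col = f"{label}_true_class"
    if not rows or true_col not in rows[0]:
        return None
    correct = sum(1 for row in rows if row[pred_col] == row[true_col])
    return correct / len(rows)


# Same values as the example table above, written with commas (assumed).
SAMPLE = (
    "example_id,cmv_predicted_class,cmv_1_proba,cmv_0_proba,cmv_true_class\n"
    "e1,1,0.8,0.2,1\n"
    "e2,0,0.2,0.8,0\n"
    "e3,1,0.78,0.22,0\n"
)

rows = read_predictions(SAMPLE)
acc = accuracy(rows, "cmv")
print(acc)
```

For a real run, the text would come from the predictions.csv file in the result path, for example via `Path(...).read_text()`.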

Specification arguments:

dataset: dataset for which examples need to be classified
config_path: path to the zip file exported from the MLModelTraining instruction (which includes the trained ML model, encoder, preprocessing, etc.)
number_of_processes (int): how many processes should be created at once to speed up the analysis. For personal machines, 4 or 8 is usually a good choice.
metrics (list): a list of metrics to compute between the true and predicted classes. These metrics are computed only when the dataset provides the same label, with the same classes, as the label the ML setting was originally trained for.

YAML specification:

instructions:
    instruction_name:
        type: MLApplication
        dataset: d1
        config_path: ./config.zip
        metrics:
        - accuracy
        - precision
        - recall
        number_of_processes: 4
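
A specification like the one above can be sanity-checked before it is run. The sketch below does this on a Python dict that mirrors the YAML structure. It is a hypothetical helper, not an immuneML function: the name `check_ml_application_spec` is invented here, and treating `dataset` and `config_path` as mandatory is an assumption, since the documentation lists the arguments without stating which ones are required.

```python
# Keys listed in the documentation for the MLApplication instruction.
DOCUMENTED_KEYS = {"type", "dataset", "config_path", "metrics", "number_of_processes"}
# Assumption of this sketch: these two must always be given.
ASSUMED_REQUIRED = {"dataset", "config_path"}


def check_ml_application_spec(spec):
    """Return a list of problems found in MLApplication instructions.

    An empty list means the sketch found nothing to complain about.
    Instructions of any other type are skipped.
    """
    problems = []
    for name, params in spec.get("instructions", {}).items():
        if params.get("type") != "MLApplication":
            continue
        for key in sorted(ASSUMED_REQUIRED - set(params)):
            problems.append(f"{name}: missing '{key}'")
        for key in sorted(set(params) - DOCUMENTED_KEYS):
            problems.append(f"{name}: undocumented key '{key}'")
        if "number_of_processes" in params:
            n = params["number_of_processes"]
            if not isinstance(n, int) or n < 1:
                problems.append(f"{name}: number_of_processes must be a positive int")
    return problems


# The YAML specification above, expressed as a Python dict.
spec = {
    "instructions": {
        "instruction_name": {
            "type": "MLApplication",
            "dataset": "d1",
            "config_path": "./config.zip",
            "metrics": ["accuracy", "precision", "recall"],
            "number_of_processes": 4,
        }
    }
}

print(check_ml_application_spec(spec))
```

The dict form is what a YAML parser would typically produce from the specification, so the same check could be applied after loading the file.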

static get_documentation()

run(result_path: Path)

immuneML.workflows.instructions.ml_model_application.MLApplicationState module

class immuneML.workflows.instructions.ml_model_application.MLApplicationState.MLApplicationState(dataset: immuneML.data_model.datasets.Dataset.Dataset, hp_setting: immuneML.hyperparameter_optimization.HPSetting.HPSetting, label_config: immuneML.environment.LabelConfiguration.LabelConfiguration, pool_size: int, name: str, metrics: list = None, path: pathlib.Path = None, predictions_path: pathlib.Path = None, metrics_path: pathlib.Path = None)

Bases: object

dataset: Dataset

hp_setting: HPSetting

label_config: LabelConfiguration

metrics: list = None

metrics_path: Path = None

name: str

path: Path = None

pool_size: int

predictions_path: Path = None

Module contents