immuneML.preprocessing package

Submodules

immuneML.preprocessing.Preprocessor module

class immuneML.preprocessing.Preprocessor.Preprocessor(result_path: Optional[pathlib.Path] = None)[source]

Bases: object

check_dataset_type(dataset, valid_dataset_types: list, location: str)[source]

keeps_example_count() → bool[source]

Defines whether the preprocessing can be run with the TrainMLModel instruction. To be compatible with it, the preprocessing must not change the number of examples in the dataset.

abstract process_dataset(dataset: immuneML.data_model.dataset.RepertoireDataset.RepertoireDataset, result_path: pathlib.Path) → immuneML.data_model.dataset.RepertoireDataset.RepertoireDataset[source]
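
The interface above can be illustrated with a minimal, self-contained sketch. It does not import immuneML: SketchPreprocessor, MinSizeFilter and usable_for_training are hypothetical stand-ins, datasets are modelled as plain lists of dicts rather than RepertoireDataset objects, and the default return value of keeps_example_count() is an assumption. The sketch only shows how a subclass that drops examples might report that through keeps_example_count().

```python
from abc import ABC, abstractmethod
from pathlib import Path
from typing import List, Optional


class SketchPreprocessor(ABC):
    """Hypothetical stand-in mirroring the documented Preprocessor interface."""

    def __init__(self, result_path: Optional[Path] = None):
        self.result_path = result_path

    def keeps_example_count(self) -> bool:
        # Assumed default: the number of examples is left unchanged.
        return True

    @abstractmethod
    def process_dataset(self, dataset: List[dict],
                        result_path: Optional[Path] = None) -> List[dict]:
        """Return a new dataset; subclasses must implement this."""


class MinSizeFilter(SketchPreprocessor):
    """Hypothetical filter that drops repertoires with too few sequences."""

    def __init__(self, min_sequences: int, result_path: Optional[Path] = None):
        super().__init__(result_path)
        self.min_sequences = min_sequences

    def keeps_example_count(self) -> bool:
        # Removing repertoires can change the number of examples.
        return False

    def process_dataset(self, dataset, result_path=None):
        return [rep for rep in dataset
                if len(rep["sequences"]) >= self.min_sequences]


def usable_for_training(preprocessing_sequence) -> bool:
    """True only if every step preserves the number of examples."""
    return all(step.keeps_example_count() for step in preprocessing_sequence)
```

In this sketch, a sequence containing MinSizeFilter would be rejected by usable_for_training, which mirrors the constraint described for the TrainMLModel instruction.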

immuneML.preprocessing.SubjectRepertoireCollector module

class immuneML.preprocessing.SubjectRepertoireCollector.SubjectRepertoireCollector(result_path: Optional[pathlib.Path] = None)[source]

Bases: immuneML.preprocessing.Preprocessor.Preprocessor

Merges all the Repertoires in a RepertoireDataset that have the same ‘subject_id’ specified in the metadata. The result is a RepertoireDataset with one Repertoire per subject. This preprocessing cannot be used in combination with the TrainMLModel instruction because it can change the number of examples. To combine the repertoires in this way, use this preprocessing with the DatasetExport instruction.
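
The merging step can be sketched in plain Python as follows. This is not the immuneML implementation: merge_by_subject is a hypothetical helper, and repertoires are modelled as dicts with a 'metadata' entry (holding 'subject_id') and a 'sequences' list, which is an assumption made only for illustration.

```python
from collections import defaultdict


def merge_by_subject(repertoires):
    """Group repertoires by metadata['subject_id'] and pool their sequences."""
    grouped = defaultdict(list)
    for rep in repertoires:
        grouped[rep["metadata"]["subject_id"]].extend(rep["sequences"])
    # One merged repertoire per subject, in order of first appearance.
    return [{"metadata": {"subject_id": subject_id}, "sequences": sequences}
            for subject_id, sequences in grouped.items()]


# Illustrative data: subject s1 has two repertoires, subject s2 has one.
repertoires = [
    {"metadata": {"subject_id": "s1"}, "sequences": ["CASSL", "CASSP"]},
    {"metadata": {"subject_id": "s2"}, "sequences": ["CAWSV"]},
    {"metadata": {"subject_id": "s1"}, "sequences": ["CASRG"]},
]
merged = merge_by_subject(repertoires)
print(len(repertoires), "->", len(merged))  # prints: 3 -> 2
```

The drop from three examples to two is the reason a step like this does not fit an instruction that expects the example count to stay fixed.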

YAML specification:

preprocessing_sequences:
    my_preprocessing:
        - my_filter: SubjectRepertoireCollector

keeps_example_count() → bool[source]

Defines whether the preprocessing can be run with the TrainMLModel instruction. To be compatible with it, the preprocessing must not change the number of examples in the dataset.

process_dataset(dataset: immuneML.data_model.dataset.RepertoireDataset.RepertoireDataset, result_path: Optional[pathlib.Path] = None)[source]

Module contents