immuneML.preprocessing package

Subpackages

immuneML.preprocessing.filters package

Submodules

immuneML.preprocessing.Preprocessor module

class immuneML.preprocessing.Preprocessor.Preprocessor(result_path: Optional[pathlib.Path] = None)

Bases: object

check_dataset_type(dataset, valid_dataset_types: list, location: str)

keeps_example_count() → bool: Returns whether the preprocessing can be used with the TrainMLModel instruction. To be usable there, the preprocessing must not change the number of examples in the dataset.

abstract process_dataset(dataset: immuneML.data_model.dataset.RepertoireDataset.RepertoireDataset, result_path: pathlib.Path) → immuneML.data_model.dataset.RepertoireDataset.RepertoireDataset
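The interface above can be illustrated with a minimal, self-contained sketch. It does not import immuneML: `PreprocessorSketch` is a hypothetical stand-in that mirrors only the documented method names, datasets are modeled as plain lists of dictionaries, and `DropEmptyRepertoires` and `usable_with_training` are invented for illustration. The default return value of `keeps_example_count()` is an assumption here, not something this page states.

```python
from abc import ABC, abstractmethod
from pathlib import Path
from typing import List, Optional


class PreprocessorSketch(ABC):
    """Hypothetical stand-in mirroring the documented Preprocessor interface."""

    def __init__(self, result_path: Optional[Path] = None):
        self.result_path = result_path

    def keeps_example_count(self) -> bool:
        # Assumption: preprocessings keep the example count unless they say otherwise.
        return True

    @abstractmethod
    def process_dataset(self, dataset: List[dict], result_path: Optional[Path] = None) -> List[dict]:
        """Return a processed copy of the dataset (here: a list of repertoire dicts)."""


class DropEmptyRepertoires(PreprocessorSketch):
    """Hypothetical filter that removes repertoires without any sequences."""

    def keeps_example_count(self) -> bool:
        # Removing repertoires can change the number of examples.
        return False

    def process_dataset(self, dataset: List[dict], result_path: Optional[Path] = None) -> List[dict]:
        return [rep for rep in dataset if rep["sequences"]]


def usable_with_training(preprocessors: List[PreprocessorSketch]) -> bool:
    """A sequence is usable for training only if every step keeps the example count."""
    return all(p.keeps_example_count() for p in preprocessors)
```

In this sketch, a sequence containing `DropEmptyRepertoires` would be rejected by `usable_with_training`, which corresponds to the restriction described for the TrainMLModel instruction.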

immuneML.preprocessing.SubjectRepertoireCollector module

class immuneML.preprocessing.SubjectRepertoireCollector.SubjectRepertoireCollector(result_path: Optional[pathlib.Path] = None)

Bases: immuneML.preprocessing.Preprocessor.Preprocessor

Merges all Repertoires in a RepertoireDataset that have the same ‘subject_id’ specified in the metadata. The result is a RepertoireDataset with one Repertoire per subject. Because merging can change the number of examples, this preprocessing cannot be used with the TrainMLModel instruction. To combine repertoires in this way, use this preprocessing with the DatasetExport instruction.

YAML specification:

preprocessing_sequences:
    my_preprocessing:
        - my_filter: SubjectRepertoireCollector
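The merging behavior described for this class can be sketched in plain Python. This is not the immuneML implementation: repertoires are modeled as dictionaries with hypothetical `subject_id` and `sequences` keys, and `merge_by_subject` is an illustrative helper name.

```python
from collections import defaultdict
from typing import Dict, List


def merge_by_subject(repertoires: List[dict]) -> List[dict]:
    """Combine repertoires sharing a subject_id into one repertoire per subject."""
    merged: Dict[str, List[str]] = defaultdict(list)
    for rep in repertoires:
        # Sequences from every repertoire of the same subject are pooled together.
        merged[rep["subject_id"]].extend(rep["sequences"])
    # Subjects appear in the order in which they were first seen.
    return [{"subject_id": sid, "sequences": seqs} for sid, seqs in merged.items()]


repertoires = [
    {"subject_id": "subject_1", "sequences": ["CASSLGTDTQYF"]},
    {"subject_id": "subject_2", "sequences": ["CASSPGQGNEQFF"]},
    {"subject_id": "subject_1", "sequences": ["CASSIRSSYEQYF"]},
]
merged = merge_by_subject(repertoires)
```

Three input repertoires become two merged ones in this toy example, which shows why the example count is not preserved and why such a step does not fit the TrainMLModel instruction.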

keeps_example_count() → bool: Returns whether the preprocessing can be used with the TrainMLModel instruction. To be usable there, the preprocessing must not change the number of examples in the dataset.

process_dataset(dataset: immuneML.data_model.dataset.RepertoireDataset.RepertoireDataset, result_path: Optional[pathlib.Path] = None)
