immuneML.preprocessing package
Subpackages
- immuneML.preprocessing.filters package
- Submodules
- immuneML.preprocessing.filters.ChainRepertoireFilter module
- immuneML.preprocessing.filters.ClonesPerRepertoireFilter module
- immuneML.preprocessing.filters.CountAggregationFunction module
- immuneML.preprocessing.filters.CountPerSequenceFilter module
- immuneML.preprocessing.filters.DuplicateSequenceFilter module
- immuneML.preprocessing.filters.Filter module
- immuneML.preprocessing.filters.MetadataRepertoireFilter module
- Module contents
Submodules
immuneML.preprocessing.Preprocessor module
- class immuneML.preprocessing.Preprocessor.Preprocessor(result_path: Optional[pathlib.Path] = None)
Bases: object
- keeps_example_count() → bool
Indicates whether the preprocessing can be run with the TrainMLModel instruction. This is only possible if the preprocessing does not change the number of examples in the dataset.
- abstract process_dataset(dataset: immuneML.data_model.dataset.RepertoireDataset.RepertoireDataset, result_path: pathlib.Path) → immuneML.data_model.dataset.RepertoireDataset.RepertoireDataset
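The contract described above (an abstract process_dataset plus a keeps_example_count flag) can be illustrated with a minimal, self-contained sketch. It does not import immuneML: SketchPreprocessor, MinSizeFilter and usable_for_training are hypothetical names, repertoires are modelled as plain dicts, and the default return value of keeps_example_count is an assumption made for this example only.

```python
from abc import ABC, abstractmethod
from pathlib import Path
from typing import Dict, List, Optional

# Assumption: a repertoire is modelled as a plain dict for illustration;
# this is not the immuneML data model.
Repertoire = Dict[str, object]


class SketchPreprocessor(ABC):
    """Hypothetical stand-in mirroring the documented interface."""

    def __init__(self, result_path: Optional[Path] = None):
        self.result_path = result_path

    def keeps_example_count(self) -> bool:
        # Assumed default: the step leaves the number of examples unchanged.
        return True

    @abstractmethod
    def process_dataset(self, dataset: List[Repertoire],
                        result_path: Optional[Path] = None) -> List[Repertoire]:
        """Return the processed dataset."""


class MinSizeFilter(SketchPreprocessor):
    """Hypothetical filter that drops repertoires with too few sequences."""

    def __init__(self, min_sequences: int, result_path: Optional[Path] = None):
        super().__init__(result_path)
        self.min_sequences = min_sequences

    def keeps_example_count(self) -> bool:
        # Dropping repertoires can change the number of examples.
        return False

    def process_dataset(self, dataset, result_path=None):
        return [rep for rep in dataset
                if len(rep["sequences"]) >= self.min_sequences]


def usable_for_training(steps: List[SketchPreprocessor]) -> bool:
    """True only if no step may change the number of examples."""
    return all(step.keeps_example_count() for step in steps)
```

In this sketch, a sequence of steps would be accepted for training only when every step reports that it keeps the example count, which matches the restriction stated for the TrainMLModel instruction.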
immuneML.preprocessing.SubjectRepertoireCollector module
- class immuneML.preprocessing.SubjectRepertoireCollector.SubjectRepertoireCollector(result_path: Optional[pathlib.Path] = None)
Bases: immuneML.preprocessing.Preprocessor.Preprocessor
Merges all Repertoires in a RepertoireDataset that share the same 'subject_id' in the metadata. The result is a RepertoireDataset with one Repertoire per subject. Because merging can change the number of examples, this preprocessing cannot be combined with the TrainMLModel instruction. To merge repertoires in this way, use this preprocessing with the DatasetExport instruction.
YAML specification:
    preprocessing_sequences:
        my_preprocessing:
            - my_filter: SubjectRepertoireCollector
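The merging step described above can be sketched in plain Python. This is an illustration of grouping by 'subject_id' under simplifying assumptions, not the immuneML implementation: repertoires are modelled as dicts with "metadata" and "sequences" keys, merge_by_subject is a hypothetical helper, and only the subject identifier is carried over into the merged record.

```python
from typing import Dict, List


def merge_by_subject(repertoires: List[dict]) -> List[dict]:
    """Combine repertoires that share a subject_id into one record each."""
    merged: Dict[str, dict] = {}
    for rep in repertoires:
        subject = rep["metadata"]["subject_id"]
        # Create the merged record the first time a subject is seen.
        if subject not in merged:
            merged[subject] = {"subject_id": subject, "sequences": []}
        # Append this repertoire's sequences to the subject's record.
        merged[subject]["sequences"].extend(rep["sequences"])
    # Dicts preserve insertion order, so subjects appear in first-seen order.
    return list(merged.values())


repertoires = [
    {"metadata": {"subject_id": "s1"}, "sequences": ["CASSL"]},
    {"metadata": {"subject_id": "s2"}, "sequences": ["CASSQ"]},
    {"metadata": {"subject_id": "s1"}, "sequences": ["CASRT"]},
]
merged = merge_by_subject(repertoires)
```

Here three input repertoires from two subjects collapse into two merged records, which shows why the number of examples can change and why the step is not compatible with the TrainMLModel instruction.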
- keeps_example_count() → bool
Indicates whether the preprocessing can be run with the TrainMLModel instruction. This is only possible if the preprocessing does not change the number of examples in the dataset.
- process_dataset(dataset: immuneML.data_model.dataset.RepertoireDataset.RepertoireDataset, result_path: Optional[pathlib.Path] = None)