immuneML.workflows.instructions.subsampling package
Submodules
immuneML.workflows.instructions.subsampling.SubsamplingInstruction module
class immuneML.workflows.instructions.subsampling.SubsamplingInstruction.SubsamplingInstruction(dataset: immuneML.data_model.dataset.Dataset.Dataset, subsampled_dataset_sizes: List[int], dataset_export_formats: list, result_path: Optional[pathlib.Path] = None, name: Optional[str] = None)
Bases: immuneML.workflows.instructions.Instruction.Instruction
Subsampling is an instruction that subsamples a given dataset and creates multiple smaller datasets according to the parameters provided.
Parameters:
dataset (Dataset) – the original dataset used as the basis for subsampling
subsampled_dataset_sizes (list) – a list of dataset sizes (number of examples), one for each subsampled dataset to create
dataset_export_formats (list) – the formats in which to export the subsampled datasets. Valid formats are the class names of any non-abstract class inheriting from DataExporter.
YAML specification:
my_subsampling_instruction: # user-defined name of the instruction
    type: Subsampling # which instruction to execute
    dataset: my_dataset # original dataset to be subsampled, with e.g., 300 examples
    subsampled_dataset_sizes: # how large the subsampled datasets should be, one dataset will be created for each list item
        - 200 # one subsampled dataset with 200 examples (200 repertoires if my_dataset was repertoire dataset)
        - 100 # the other subsampled dataset will have 100 examples
    dataset_export_formats: # in which formats to export the subsampled datasets
        - ImmuneML
        - AIRR
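As a rough illustration of what the specification above asks for, the following sketch draws one random subset per requested size from a list of example identifiers. It is not immuneML's implementation: the function name `subsample_examples`, the `seed` parameter, and the choice of sampling without replacement are assumptions made for illustration only.

```python
import random
from typing import Dict, List, Sequence


def subsample_examples(example_ids: Sequence[str], sizes: List[int], seed: int = 0) -> Dict[int, List[str]]:
    """Hypothetical helper: draw one subset per requested size, without replacement."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    subsets = {}
    for size in sizes:
        if size > len(example_ids):
            raise ValueError(f"requested {size} examples, but only {len(example_ids)} are available")
        # each subset is drawn independently from the full original dataset
        subsets[size] = rng.sample(list(example_ids), size)
    return subsets


# mirrors the YAML above: a 300-example dataset subsampled to 200 and 100 examples
original_ids = [f"example_{i}" for i in range(300)]
subsampled = subsample_examples(original_ids, sizes=[200, 100])
print({size: len(ids) for size, ids in subsampled.items()})  # {200: 200, 100: 100}
```

In this sketch every subset is drawn from the full original dataset, so the 100-example subset is not necessarily contained in the 200-example one.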
immuneML.workflows.instructions.subsampling.SubsamplingState module
class immuneML.workflows.instructions.subsampling.SubsamplingState.SubsamplingState(dataset: immuneML.data_model.dataset.Dataset.Dataset, subsampled_dataset_sizes: List[int] = <factory>, dataset_exporters: List[immuneML.IO.dataset_export.DataExporter.DataExporter] = <factory>, result_path: pathlib.Path = None, name: str = None, subsampled_datasets: List[immuneML.data_model.dataset.Dataset.Dataset] = <factory>, subsampled_dataset_paths: dict = <factory>)
Bases: object
dataset_exporters: List[immuneML.IO.dataset_export.DataExporter.DataExporter]
name: str = None
result_path: pathlib.Path = None
subsampled_dataset_paths: dict
subsampled_dataset_sizes: List[int]
subsampled_datasets: List[immuneML.data_model.dataset.Dataset.Dataset]
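The `<factory>` defaults in the SubsamplingState signature are how Python dataclasses display fields declared with `default_factory`, which gives each instance its own fresh list or dict. A minimal sketch of that pattern follows; `StateSketch` is a hypothetical class with plain `str` stand-ins for the immuneML types, not the library's actual source.

```python
import inspect
from dataclasses import dataclass, field
from pathlib import Path
from typing import List, Optional


@dataclass
class StateSketch:
    """Hypothetical stand-in for a state object; str replaces the immuneML types."""
    dataset: str
    subsampled_dataset_sizes: List[int] = field(default_factory=list)
    dataset_exporters: List[str] = field(default_factory=list)
    result_path: Optional[Path] = None
    name: Optional[str] = None
    subsampled_datasets: List[str] = field(default_factory=list)
    subsampled_dataset_paths: dict = field(default_factory=dict)


first = StateSketch(dataset="my_dataset", subsampled_dataset_sizes=[200, 100])
second = StateSketch(dataset="other_dataset")

# mutating one instance's list leaves the other instance's list untouched
first.subsampled_datasets.append("my_dataset_200")
print(second.subsampled_datasets)  # []

# the generated __init__ signature shows "<factory>" for default_factory fields
print(inspect.signature(StateSketch))
```

Using `default_factory` avoids the shared-mutable-default problem: a plain `= []` default is rejected by `dataclass` precisely because all instances would otherwise share one list.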