immuneML.dsl.instruction_parsers package
Submodules
immuneML.dsl.instruction_parsers.DatasetExportParser module
- class immuneML.dsl.instruction_parsers.DatasetExportParser.DatasetExportParser
Bases: object
Specification of the DatasetExport instruction, using a randomly generated dataset:
    definitions:
        datasets:
            my_generated_dataset: # a dataset to be exported in the given format
                format: RandomRepertoireDataset
                params:
                    result_path: generated_dataset/
                    repertoire_count: 100
                    sequence_count_probabilities:
                        100: 0.5
                        120: 0.5
                    sequence_length_probabilities:
                        12: 0.333
                        13: 0.333
                        14: 0.333
                    labels:
                        immune_event_1:
                            yes: 0.5
                            no: 0.5
        preprocessing_sequences:
            my_preprocessing:
                - my_filter:
                    ClonesPerRepertoireFilter:
                        lower_limit: 110
                        upper_limit: 200
    instructions:
        my_instruction1: # instruction name
            type: DatasetExport
            datasets: # list of datasets to export
                - my_generated_dataset
            preprocessing_sequence: my_preprocessing
            export_formats: # list of formats to export the datasets to
                - AIRR
                - ImmuneML
- OPTIONAL_KEYS = ['preprocessing_sequence']
- REQUIRED_KEYS = ['type', 'datasets', 'export_formats']
- parse(key: str, instruction: dict, symbol_table: immuneML.dsl.symbol_table.SymbolTable.SymbolTable, path: Optional[pathlib.Path] = None) -> immuneML.workflows.instructions.dataset_generation.DatasetExportInstruction.DatasetExportInstruction
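The two key lists above suggest how an instruction entry could be checked before it is parsed. The following is a minimal sketch of such a check, not the parser's actual code: the helper name check_instruction_keys and the error messages are hypothetical, and it assumes that any key outside the two lists is rejected.

```python
# Key lists copied from the class attributes documented above.
REQUIRED_KEYS = ["type", "datasets", "export_formats"]
OPTIONAL_KEYS = ["preprocessing_sequence"]


def check_instruction_keys(name: str, instruction: dict) -> None:
    """Hypothetical helper: raise ValueError on missing or unknown keys."""
    missing = [key for key in REQUIRED_KEYS if key not in instruction]
    allowed = REQUIRED_KEYS + OPTIONAL_KEYS
    unknown = [key for key in instruction if key not in allowed]
    if missing:
        raise ValueError(f"{name}: missing required keys {missing}")
    if unknown:
        raise ValueError(f"{name}: unknown keys {unknown}")


# The instruction from the specification above, written as a Python dict.
spec = {
    "type": "DatasetExport",
    "datasets": ["my_generated_dataset"],
    "preprocessing_sequence": "my_preprocessing",
    "export_formats": ["AIRR", "ImmuneML"],
}
check_instruction_keys("my_instruction1", spec)  # returns None when the keys are valid
```

Checking keys before resolving identifiers keeps error messages tied to the instruction name the user wrote in the YAML file.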
immuneML.dsl.instruction_parsers.ExploratoryAnalysisParser module
- class immuneML.dsl.instruction_parsers.ExploratoryAnalysisParser.ExploratoryAnalysisParser
Bases: object
The specification consists of a list of analyses to perform.
Each analysis is defined by a dataset identifier, a report identifier and, optionally, an encoding and labels. Each analysis is loaded into an ExploratoryAnalysisUnit object.
DSL example for ExploratoryAnalysisInstruction, assuming that d1, r1, r2 and e1 were defined previously in the definitions section:
    instruction_name:
        type: ExploratoryAnalysis
        number_of_processes: 4
        analyses:
            my_first_analysis:
                dataset: d1
                report: r1
            my_second_analysis:
                dataset: d1
                encoding: e1
                report: r2
                labels:
                    - CD
                    - CMV
- parse(key: str, instruction: dict, symbol_table: immuneML.dsl.symbol_table.SymbolTable.SymbolTable, path: Optional[pathlib.Path] = None) -> immuneML.workflows.instructions.exploratory_analysis.ExploratoryAnalysisInstruction.ExploratoryAnalysisInstruction
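To illustrate how the analyses in the DSL example might be turned into one unit per analysis, here is a rough sketch. It is not the parser's implementation: AnalysisUnit is a simplified, hypothetical stand-in for ExploratoryAnalysisUnit, a plain dict stands in for SymbolTable, and build_units is an invented helper name.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class AnalysisUnit:
    """Simplified, hypothetical stand-in for ExploratoryAnalysisUnit."""
    dataset: object
    report: object
    encoding: Optional[object] = None
    labels: list = field(default_factory=list)


def build_units(instruction: dict, symbols: dict) -> dict:
    """Resolve each analysis entry into an AnalysisUnit keyed by analysis name."""
    units = {}
    for name, analysis in instruction["analyses"].items():
        # dataset and report are mandatory; encoding and labels are optional
        for required in ("dataset", "report"):
            if required not in analysis:
                raise ValueError(f"{name}: missing '{required}'")
        # every identifier must have been defined earlier
        for key in ("dataset", "report", "encoding"):
            if key in analysis and analysis[key] not in symbols:
                raise ValueError(f"{name}: '{analysis[key]}' is not defined")
        units[name] = AnalysisUnit(
            dataset=symbols[analysis["dataset"]],
            report=symbols[analysis["report"]],
            encoding=symbols[analysis["encoding"]] if "encoding" in analysis else None,
            labels=list(analysis.get("labels", [])),
        )
    return units


# Plain dict standing in for SymbolTable: identifier -> previously parsed object.
symbols = {"d1": "dataset d1", "r1": "report r1", "r2": "report r2", "e1": "encoding e1"}
instruction = {
    "type": "ExploratoryAnalysis",
    "number_of_processes": 4,
    "analyses": {
        "my_first_analysis": {"dataset": "d1", "report": "r1"},
        "my_second_analysis": {"dataset": "d1", "encoding": "e1", "report": "r2",
                               "labels": ["CD", "CMV"]},
    },
}
units = build_units(instruction, symbols)
```

In this sketch an analysis without an encoding simply carries None, which mirrors the optional encoding and labels described above.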
immuneML.dsl.instruction_parsers.LabelHelper module
- class immuneML.dsl.instruction_parsers.LabelHelper.LabelHelper
Bases: object
- static create_label_config(labels: list, dataset: immuneML.data_model.dataset.Dataset.Dataset, instruction_name: str, yaml_location: str) -> immuneML.environment.LabelConfiguration.LabelConfiguration
immuneML.dsl.instruction_parsers.MLApplicationParser module
- class immuneML.dsl.instruction_parsers.MLApplicationParser.MLApplicationParser
Bases: object
Specification example for the MLApplication instruction:
    instruction_name:
        type: MLApplication
        dataset: d1
        config_path: ./config.zip
        number_of_processes: 4
        label: CD
- parse(key: str, instruction: dict, symbol_table: immuneML.dsl.symbol_table.SymbolTable.SymbolTable, path: pathlib.Path) -> immuneML.workflows.instructions.ml_model_application.MLApplicationInstruction.MLApplicationInstruction
immuneML.dsl.instruction_parsers.SimulationParser module
- class immuneML.dsl.instruction_parsers.SimulationParser.SimulationParser
Bases: object
YAML specification:
    definitions:
        datasets:
            my_dataset:
                ...
        motifs:
            m1:
                seed: AAC # "/" character denotes the gap in the seed if present (e.g. AA/C)
                instantiation:
                    GappedKmer:
                        # alphabet_weights: when a Hamming distance is allowed, the probability that
                        # a letter in the seed is replaced by each of the other alphabet letters
                        alphabet_weights:
                            A: 0.2
                            C: 0.2
                            D: 0.4
                            E: 0.2
                        # position_weights: relative probabilities of choosing each position in the seed
                        # for Hamming distance modification; they are scaled to sum to one
                        position_weights:
                            0: 1
                            1: 0
                            2: 0
                        hamming_distance_probabilities:
                            0: 0.5 # Hamming distance of 0 (no change) with probability 0.5
                            1: 0.5 # Hamming distance of 1 (one letter change) with probability 0.5
                        min_gap: 0
                        max_gap: 1
        signals:
            s1:
                motifs: # list of all motifs for the signal; one is sampled uniformly to get a motif instance for implanting
                    - m1
                sequence_position_weights: # likelihood of implanting at an IMGT position of the receptor sequence
                    107: 0.5
                implanting: HealthySequence # implant a motif only into sequences with no other signals
        simulations:
            sim1: # one Simulation object consists of a dict of Implanting objects
                i1:
                    dataset_implanting_rate: 0.5 # fraction of repertoires in which the signals will be implanted
                    repertoire_implanting_rate: 0.01 # fraction of sequences within a repertoire in which the signals will be implanted
                    signals:
                        - s1
    instructions:
        my_simulation_instruction:
            type: Simulation
            dataset: my_dataset
            simulation: sim1
            export_formats: [AIRR, ImmuneML]
- parse(key: str, instruction: dict, symbol_table: immuneML.dsl.symbol_table.SymbolTable.SymbolTable, path: Optional[pathlib.Path] = None) -> immuneML.workflows.instructions.SimulationInstruction.SimulationInstruction
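The comments in the motif definition describe three sampling steps: position weights are scaled to sum to one, a Hamming distance is drawn from hamming_distance_probabilities, and replacement letters are drawn from alphabet_weights. The sketch below shows one way these steps could fit together. It is an illustration under stated assumptions, not the library's code: the function names are hypothetical, gaps ("/" in the seed) are ignored, modified positions are drawn without replacement, and the current letter is excluded from the replacement candidates so that each modification really changes the sequence.

```python
import random


def normalize(weights: dict) -> dict:
    """Scale non-negative weights so that they sum to one."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    return {key: value / total for key, value in weights.items()}


def make_motif_instance(seed: str, alphabet_weights: dict, position_weights: dict,
                        hamming_distance_probabilities: dict, rng: random.Random) -> str:
    """Hypothetical sketch: derive one motif instance from an ungapped seed."""
    position_probs = normalize(position_weights)
    distances = list(hamming_distance_probabilities)
    distance = rng.choices(
        distances, weights=[hamming_distance_probabilities[d] for d in distances])[0]

    letters = list(seed)
    # only positions with a non-zero weight can be modified
    candidates = {pos: w for pos, w in position_probs.items() if w > 0}
    for _ in range(min(distance, len(candidates))):
        positions = list(candidates)
        pos = rng.choices(positions, weights=[candidates[p] for p in positions])[0]
        del candidates[pos]  # assumption: each position is modified at most once
        # assumption: the current letter is not a valid replacement
        options = {a: w for a, w in alphabet_weights.items() if a != letters[pos] and w > 0}
        if not options:
            continue
        alphabet = list(options)
        letters[pos] = rng.choices(alphabet, weights=[options[a] for a in alphabet])[0]
    return "".join(letters)


# Values taken from motif m1 in the specification above.
rng = random.Random(0)
instance = make_motif_instance(
    seed="AAC",
    alphabet_weights={"A": 0.2, "C": 0.2, "D": 0.4, "E": 0.2},
    position_weights={0: 1, 1: 0, 2: 0},
    hamming_distance_probabilities={0: 0.5, 1: 0.5},
    rng=rng,
)
```

With the weights from m1, only position 0 can change, so every instance produced by this sketch ends in "AC" and differs from the seed in at most one letter.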
immuneML.dsl.instruction_parsers.SubsamplingParser module
- class immuneML.dsl.instruction_parsers.SubsamplingParser.SubsamplingParser
Bases: object
- parse(key: str, instruction: dict, symbol_table: immuneML.dsl.symbol_table.SymbolTable.SymbolTable, path: Optional[pathlib.Path] = None) -> immuneML.workflows.instructions.subsampling.SubsamplingInstruction.SubsamplingInstruction
immuneML.dsl.instruction_parsers.TrainMLModelParser module
- class immuneML.dsl.instruction_parsers.TrainMLModelParser.TrainMLModelParser
Bases: object
- parse(key: str, instruction: dict, symbol_table: immuneML.dsl.symbol_table.SymbolTable.SymbolTable, path: Optional[pathlib.Path] = None) -> immuneML.workflows.instructions.TrainMLModelInstruction.TrainMLModelInstruction