immuneML.dsl package
Subpackages
- immuneML.dsl.definition_parsers package
- Submodules
- immuneML.dsl.definition_parsers.DefinitionParser module
- immuneML.dsl.definition_parsers.DefinitionParserOutput module
- immuneML.dsl.definition_parsers.EncodingParser module
- immuneML.dsl.definition_parsers.MLParser module
- immuneML.dsl.definition_parsers.MotifParser module
- immuneML.dsl.definition_parsers.PreprocessingParser module
- immuneML.dsl.definition_parsers.ReportParser module
- immuneML.dsl.definition_parsers.SignalParser module
- immuneML.dsl.definition_parsers.SimulationParser module
- Module contents
- immuneML.dsl.import_parsers package
- immuneML.dsl.instruction_parsers package
- Submodules
- immuneML.dsl.instruction_parsers.DatasetExportParser module
- immuneML.dsl.instruction_parsers.ExploratoryAnalysisParser module
- immuneML.dsl.instruction_parsers.LabelHelper module
- immuneML.dsl.instruction_parsers.MLApplicationParser module
- immuneML.dsl.instruction_parsers.SimulationParser module
- immuneML.dsl.instruction_parsers.SubsamplingParser module
- immuneML.dsl.instruction_parsers.TrainMLModelParser module
- Module contents
- immuneML.dsl.semantic_model package
- immuneML.dsl.symbol_table package
Submodules
immuneML.dsl.DefaultParamsLoader module
immuneML.dsl.ImmuneMLParser module
- class immuneML.dsl.ImmuneMLParser.ImmuneMLParser[source]
Bases: object
A simple DSL parser that reads a Python dictionary, or the equivalent YAML, to configure repertoire / receptor_sequence classification in (simulated) settings.
DSL example with hyper-parameter optimization:
```yaml
definitions:
    datasets:
        d1:
            format: MiXCR
            params:
                result_path: loaded_dataset/
                region_type: IMGT_CDR3
                path: path_to_files/
                metadata_file: metadata.csv
    encodings:
        e1:
            KmerFrequency:
                k: 3
        e2:
            Word2Vec:
                vector_size: 16
                context: sequence
    ml_methods:
        log_reg1:
            LogisticRegression:
                C: 0.001
    reports:
        r1: SequenceLengthDistribution
    preprocessing_sequences:
        seq1:
            - filter_chain_B:
                ChainRepertoireFilter:
                    keep_chain: A
            - filter_clonotype:
                ClonesPerRepertoireFilter:
                    lower_limit: 1000
        seq2:
            - filter_clonotype:
                ClonesPerRepertoireFilter:
                    lower_limit: 500
            - filter_chain_A:
                ChainRepertoireFilter:
                    keep_chain: B
instructions:
    inst1:
        type: TrainMLModel
        settings:
            - preprocessing: seq1
              encoding: e1
              ml_method: log_reg1
            - preprocessing: seq2
              encoding: e2
              ml_method: log_reg1
        assessment:
            split_strategy: random
            split_count: 1
            training_percentage: 70
            reports:
                data: []
                data_splits: []
                encoding: []
                models: []
        selection:
            split_strategy: k-fold
            split_count: 5
            reports:
                data: []
                data_splits: [r1]
                encoding: []
                models: []
        labels:
            - CD
        dataset: d1
        strategy: GridSearch
        metrics: [accuracy, f1_micro]
        optimization_metric: balanced_accuracy
        reports: []
output: # this section can also be omitted, in that case output will be automatically HTML
    format: HTML # or None
```
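Because the parser accepts a Python dictionary equivalent to the YAML above, a trimmed-down configuration can be sketched directly as a nested dict. The `validate_top_level` helper below is hypothetical (not part of the immuneML API); it only illustrates the kind of check a parser would run to confirm both required top-level sections are present.

```python
# Minimal sketch of the Python-dictionary form of a DSL configuration,
# mirroring the structure of the YAML example above. validate_top_level
# is a hypothetical helper, not part of immuneML itself.

minimal_config = {
    "definitions": {
        "datasets": {
            "d1": {
                "format": "MiXCR",
                "params": {
                    "path": "path_to_files/",
                    "metadata_file": "metadata.csv",
                },
            }
        },
        "encodings": {"e1": {"KmerFrequency": {"k": 3}}},
        "ml_methods": {"log_reg1": {"LogisticRegression": {"C": 0.001}}},
    },
    "instructions": {
        "inst1": {
            "type": "TrainMLModel",
            "dataset": "d1",
            "labels": ["CD"],
        }
    },
}


def validate_top_level(config: dict) -> list:
    """Return a list of required top-level sections missing from config."""
    required = ("definitions", "instructions")
    return [key for key in required if key not in config]


print(validate_top_level(minimal_config))  # prints [] (nothing missing)
```

The same nested-dict shape is what `yaml.safe_load` would produce from the YAML example, so either form can be validated this way before handing it to the parser.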