immuneML.analysis package
Subpackages
- immuneML.analysis.criteria_matches package
- Submodules
- immuneML.analysis.criteria_matches.BooleanType module
- immuneML.analysis.criteria_matches.CriteriaMatcher module
- immuneML.analysis.criteria_matches.CriteriaTypeInstantiator module
- immuneML.analysis.criteria_matches.DataType module
- immuneML.analysis.criteria_matches.OperationType module
- Module contents
- immuneML.analysis.data_manipulation package
- immuneML.analysis.entropy_calculations package
- immuneML.analysis.similarities package
Submodules
immuneML.analysis.AxisType module
immuneML.analysis.SequenceMatcher module
- class immuneML.analysis.SequenceMatcher.SequenceMatcher
Bases: object
Matches the sequences across the given list of reference sequences (a list of ReceptorSequence objects) and returns the following information:

    {
        "repertoires": [{
            "sequences": [{
                "sequence": "AAA",
                "matching_sequences": ["AAA", "AAC"],
                "v_gene": "V12",
                "j_gene": "J3",
                "chain": "A"
            }],  # list of sequences for the repertoire with matched sequences for each original sequence
            "repertoire": "fdjshfk321231",  # repertoire identifier
            "repertoire_index": 2,  # the index of the repertoire in the dataset
            "sequences_matched": 4,  # number of sequences from the repertoire which are a match for at least one reference sequence
            "percentage_of_sequences_matched": 0.75,  # percentage of sequences from the repertoire that have at least one match in the reference sequences
            "metadata": {"CD": True},  # dict with parameters that can be used for analysis on repertoire level and that serve as a starting point for label configurations
            "chains": ["A", "B"]  # list of chains in the repertoire
        }, …]
    }
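As a rough illustration of how the documented structure could be consumed, the sketch below builds a dict with the same keys and the same illustrative values as the description above, then reads per-repertoire summaries from it. The helper names summarise_matches and sequences_with_matches are hypothetical and not part of immuneML; the dict is written by hand here rather than produced by SequenceMatcher.match.

```python
# Hand-written dict mirroring the structure documented above.
# Values are the illustrative ones from the description, not real output.
result = {
    "repertoires": [
        {
            "sequences": [
                {
                    "sequence": "AAA",
                    "matching_sequences": ["AAA", "AAC"],
                    "v_gene": "V12",
                    "j_gene": "J3",
                    "chain": "A",
                }
            ],
            "repertoire": "fdjshfk321231",
            "repertoire_index": 2,
            "sequences_matched": 4,
            "percentage_of_sequences_matched": 0.75,
            "metadata": {"CD": True},
            "chains": ["A", "B"],
        }
    ]
}


def summarise_matches(result: dict) -> list:
    """Hypothetical helper: one (identifier, matched count, matched share) tuple per repertoire."""
    return [
        (rep["repertoire"], rep["sequences_matched"], rep["percentage_of_sequences_matched"])
        for rep in result["repertoires"]
    ]


def sequences_with_matches(repertoire: dict) -> list:
    """Hypothetical helper: original sequences that have at least one matching reference sequence."""
    return [seq["sequence"] for seq in repertoire["sequences"] if seq["matching_sequences"]]


for identifier, matched, share in summarise_matches(result):
    print(identifier, matched, share)
```

Keeping the summary fields and the per-sequence list under one repertoire entry means repertoire-level analysis can read the counts directly without re-scanning the sequences.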
- CORES = 4
- match(dataset: immuneML.data_model.dataset.RepertoireDataset.RepertoireDataset, reference_sequences: list, max_distance: int, summary_type: immuneML.encodings.reference_encoding.SequenceMatchingSummaryType.SequenceMatchingSummaryType) -> dict
- match_repertoire(repertoire: immuneML.data_model.repertoire.Repertoire.Repertoire, index: int, reference_sequences: list, max_distance: int, summary_type: immuneML.encodings.reference_encoding.SequenceMatchingSummaryType.SequenceMatchingSummaryType) -> dict
- match_sequence(sequence: immuneML.data_model.receptor.receptor_sequence.ReceptorSequence.ReceptorSequence, reference_sequences: list, max_distance: int) -> dict
- matches_sequence(original_sequence: immuneML.data_model.receptor.receptor_sequence.ReceptorSequence.ReceptorSequence, reference_sequence: immuneML.data_model.receptor.receptor_sequence.ReceptorSequence.ReceptorSequence, max_distance)
- Parameters
original_sequence – ReceptorSequence
reference_sequence – ReceptorSequence
max_distance – maximum Levenshtein distance allowed between two sequences for them to be considered a match
- Returns
True if the chain, v_gene and j_gene are the same and the sequences are within the given Levenshtein distance
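The matching rule described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the immuneML implementation: SimpleSequence is a hypothetical stand-in for ReceptorSequence that carries only the compared fields, levenshtein is a textbook dynamic-programming edit distance written for this example, and the comparison is assumed to be inclusive (a distance equal to max_distance counts as a match).

```python
from dataclasses import dataclass


@dataclass
class SimpleSequence:
    """Hypothetical stand-in for ReceptorSequence; only the compared fields are kept."""
    sequence: str
    v_gene: str
    j_gene: str
    chain: str


def levenshtein(a: str, b: str) -> int:
    """Edit distance (insertions, deletions, substitutions) via row-by-row dynamic programming."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution or match
        previous = current
    return previous[-1]


def matches_sequence(original: SimpleSequence, reference: SimpleSequence, max_distance: int) -> bool:
    """Sketch of the documented rule: same chain, v_gene and j_gene, and sequences close enough.

    Assumption: the distance threshold is inclusive (<= max_distance).
    """
    return (original.chain == reference.chain
            and original.v_gene == reference.v_gene
            and original.j_gene == reference.j_gene
            and levenshtein(original.sequence, reference.sequence) <= max_distance)


query = SimpleSequence("AAA", "V12", "J3", "A")
close = SimpleSequence("AAC", "V12", "J3", "A")
other_gene = SimpleSequence("AAA", "V7", "J3", "A")

print(matches_sequence(query, close, max_distance=1))
print(matches_sequence(query, other_gene, max_distance=1))
```

Checking the cheap equality conditions before computing the edit distance lets the expression short-circuit, so the distance is only computed for sequence pairs that already agree on chain and genes.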