immuneML.util package
Submodules
immuneML.util.AdaptiveImportHelper module
- class immuneML.util.AdaptiveImportHelper.AdaptiveImportHelper[source]
Bases:
object
- static parse_germline(dataframe: pandas.DataFrame, gene_name_replacement: dict, germline_value_replacement: dict)[source]
- static preprocess_dataframe(dataframe: pandas.DataFrame, params: immuneML.IO.dataset_import.DatasetImportParams.DatasetImportParams)[source]
immuneML.util.CompAIRRHelper module
immuneML.util.CompAIRRParams module
- class immuneML.util.CompAIRRParams.CompAIRRParams(compairr_path: pathlib.Path, keep_compairr_input: bool, differences: int, indels: bool, ignore_counts: bool, ignore_genes: bool, threads: int, output_filename: str, log_filename: str)[source]
Bases:
object
- compairr_path: pathlib.Path
- differences: int
- ignore_counts: bool
- ignore_genes: bool
- indels: bool
- keep_compairr_input: bool
- log_filename: str
- output_filename: str
- threads: int
immuneML.util.DistanceMetrics module
immuneML.util.DocEnumHelper module
immuneML.util.EncoderHelper module
- class immuneML.util.EncoderHelper.EncoderHelper[source]
Bases:
object
- static build_comparison_data(dataset: immuneML.data_model.dataset.RepertoireDataset.RepertoireDataset, params: immuneML.encodings.EncoderParams.EncoderParams, comparison_attributes, sequence_batch_size)[source]
- static prepare_training_ids(dataset: immuneML.data_model.dataset.Dataset.Dataset, params: immuneML.encodings.EncoderParams.EncoderParams)[source]
- static store(encoded_dataset, params: immuneML.encodings.EncoderParams.EncoderParams)[source]
immuneML.util.FilenameHandler module
- class immuneML.util.FilenameHandler.FilenameHandler[source]
Bases:
object
- static get_filename(class_name: str, file_type: str)[source]
Converts the class name to snake case and appends the given file type.
- Parameters
class_name – name of the class that will be stored in the file
file_type – file extension: pickle, json
- Returns
filename consisting of the class_name in snake case concatenated with the file type
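The conversion described for get_filename can be illustrated with a small standalone sketch. It does not call immuneML: `snake_case_filename` is a hypothetical helper, and joining the name and the file type with a dot is an assumption about the output format, not something this page states.

```python
import re


def snake_case_filename(class_name: str, file_type: str) -> str:
    """Hypothetical stand-in for the described behaviour (not the immuneML code)."""
    # Insert an underscore before every capital letter except the first, then lowercase.
    snake = re.sub(r"(?<!^)(?=[A-Z])", "_", class_name).lower()
    # Assumption: the snake-case name and the file type are joined with a dot.
    return f"{snake}.{file_type}"


print(snake_case_filename("RepertoireDataset", "pickle"))  # repertoire_dataset.pickle
```

Note that this simple rule splits on every capital letter, so class names containing acronyms would need extra handling.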
immuneML.util.ImportHelper module
- class immuneML.util.ImportHelper.ImportHelper[source]
Bases:
object
- DATASET_FORMAT = 'iml_dataset'
- static build_receptor_from_rows(first_row, second_row, identifier, chain_pair, metadata_columns)[source]
- static drop_empty_sequences(dataframe: pandas.DataFrame, import_empty_aa_sequences: bool, import_empty_nt_sequences: bool) pandas.DataFrame [source]
- static drop_illegal_character_sequences(dataframe: pandas.DataFrame, import_illegal_characters: bool) pandas.DataFrame [source]
- static import_dataset(import_class, params: dict, dataset_name: str) immuneML.data_model.dataset.Dataset [source]
- static import_items(import_class, path, params: immuneML.IO.dataset_import.DatasetImportParams.DatasetImportParams)[source]
- static import_receptors(df, params) List[immuneML.data_model.receptor.Receptor.Receptor] [source]
- static import_receptors_by_id(df, identifier, chain_pair, metadata_columns) List[immuneML.data_model.receptor.Receptor.Receptor] [source]
- static import_repertoire_dataset(import_class, params: immuneML.IO.dataset_import.DatasetImportParams.DatasetImportParams, dataset_name: str) immuneML.data_model.dataset.RepertoireDataset.RepertoireDataset [source]
Creates a dataset from the metadata and a list of repertoire files, and exports the dataset as a pickle file.
- Parameters
import_class – class to use for import
params – instance of DatasetImportParams class which includes information on path, columns, result path etc.
dataset_name – user-defined name of the dataset
- Returns
RepertoireDataset object that was created
- static import_sequence(row, metadata_columns=None) immuneML.data_model.receptor.receptor_sequence.ReceptorSequence.ReceptorSequence [source]
- static junction_to_cdr3(df: pandas.DataFrame, region_type: immuneML.data_model.receptor.RegionType.RegionType)[source]
If RegionType is CDR3, the leading C and the trailing W (or F, depending on the chain) are removed from the junction sequence to match the IMGT CDR3 definition. This method alters the data in the provided dataframe in place.
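As a rough illustration of the trimming described for junction_to_cdr3, the sketch below removes the first and last residue of a single junction string. It is not the immuneML implementation, which operates on a DataFrame; `junction_to_cdr3_sketch` is a hypothetical helper, and returning very short or missing values unchanged is an assumption.

```python
from typing import Optional


def junction_to_cdr3_sketch(junction: Optional[str]) -> Optional[str]:
    """Drop the leading and trailing anchor residue of a junction (hypothetical helper)."""
    # Assumption: values too short to hold both anchor residues are left as they are.
    if junction is None or len(junction) < 2:
        return junction
    # The slice discards the first and the last character only.
    return junction[1:-1]


print(junction_to_cdr3_sketch("CARDYGMDVW"))  # ARDYGMDV
```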
- static load_repertoire_as_object(import_class, metadata_row, params: immuneML.IO.dataset_import.DatasetImportParams.DatasetImportParams)[source]
- static make_new_metadata_file(repertoires: list, metadata: pandas.DataFrame, result_path: pathlib.Path, dataset_name: str) pathlib.Path [source]
- static prepare_frame_type_list(params: immuneML.IO.dataset_import.DatasetImportParams.DatasetImportParams) list [source]
- static rename_dataframe_columns(df, params: immuneML.IO.dataset_import.DatasetImportParams.DatasetImportParams)[source]
- static safe_load_dataframe(filepath, params: immuneML.IO.dataset_import.DatasetImportParams.DatasetImportParams)[source]
- static strip_suffix(df: pandas.DataFrame, column_name, delimiter)[source]
Safely removes everything after a delimiter from a column in the DataFrame
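A minimal sketch of the idea behind strip_suffix, applied to a single value instead of a DataFrame column. The helper name `strip_suffix_sketch` is hypothetical, and treating "safely" as "missing values and values without the delimiter pass through unchanged" is an assumption.

```python
from typing import Optional


def strip_suffix_sketch(value: Optional[str], delimiter: str) -> Optional[str]:
    """Keep only the text before the first delimiter (hypothetical helper)."""
    # Assumption: missing values are passed through untouched.
    if value is None:
        return None
    # partition returns the whole string as the first element if the delimiter is absent.
    return value.partition(delimiter)[0]


print(strip_suffix_sketch("TRBV7-2*01", "*"))  # TRBV7-2
```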
- static update_gene_info(df: pandas.DataFrame)[source]
Updates gene info in 2 steps:
- First, columns are added if they were not present. This is done by going from the highest level of information (alleles) towards the lowest level of information (subgroups) by stripping away suffixes. If gene and subgroup columns were already present, suffixes are still stripped away just in case.
- Next, if there are None values present, the highest possible level of information is copied in from the lower level information fields. This is done by moving from subgroups towards alleles. So if for one particular receptor only the subgroup was present, the subgroup will be copied into the gene and allele columns.
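The two steps of update_gene_info can be sketched for a single record as follows. This is an illustration under assumptions, not the immuneML code: the field names `allele`, `gene` and `subgroup`, the `*` and `-` delimiters (taken from IMGT-style gene names), and the helper `update_gene_info_sketch` are all hypothetical, and the real method works on DataFrame columns.

```python
from typing import Dict, Optional

LEVELS = ["allele", "gene", "subgroup"]       # highest to lowest level of information
DELIMITERS = {"gene": "*", "subgroup": "-"}   # assumed IMGT-style separators


def update_gene_info_sketch(record: Dict[str, Optional[str]]) -> Dict[str, Optional[str]]:
    result = {level: record.get(level) for level in LEVELS}

    # Step 1: derive each lower level from the level above it by stripping the suffix.
    # Values that are already present are stripped as well.
    for higher, lower in zip(LEVELS, LEVELS[1:]):
        source = result[lower] if result[lower] is not None else result[higher]
        if source is not None:
            result[lower] = source.partition(DELIMITERS[lower])[0]

    # Step 2: fill remaining gaps upwards, moving from subgroup towards allele.
    for lower, higher in zip(reversed(LEVELS), reversed(LEVELS[:-1])):
        if result[higher] is None:
            result[higher] = result[lower]
    return result


print(update_gene_info_sketch({"allele": "TRBV7-2*01"}))
print(update_gene_info_sketch({"subgroup": "TRBV7"}))
```

In the first call the gene and subgroup are derived from the allele; in the second, the subgroup is the only information available and is copied into the gene and allele fields.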
immuneML.util.KmerHelper module
- class immuneML.util.KmerHelper.KmerHelper[source]
Bases:
object
- static create_IMGT_gapped_kmers_from_sequence(sequence: immuneML.data_model.receptor.receptor_sequence.ReceptorSequence.ReceptorSequence, sequence_type: immuneML.environment.SequenceType.SequenceType, k_left: int, max_gap: int, k_right: Optional[int] = None, min_gap: int = 0)[source]
- static create_IMGT_kmers_from_sequence(sequence: immuneML.data_model.receptor.receptor_sequence.ReceptorSequence.ReceptorSequence, k: int, sequence_type: immuneML.environment.SequenceType.SequenceType)[source]
- static create_all_kmers(k: int, alphabet: list)[source]
Creates all possible k-mers given a k-mer length and an alphabet.
- Parameters
k (int) – length of k-mer
alphabet (list) – list of characters from which to make all possible k-mers
- Returns
alphabetically sorted list of k-mers
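One way to enumerate all k-mers as described for create_all_kmers is a Cartesian product over the alphabet. The sketch below is standalone and may differ from the immuneML implementation; `all_kmers_sketch` is a hypothetical name.

```python
from itertools import product


def all_kmers_sketch(k: int, alphabet: list) -> list:
    """Return every length-k string over the alphabet, sorted alphabetically."""
    # product yields every length-k tuple of characters; join turns each into a string.
    return sorted("".join(letters) for letters in product(alphabet, repeat=k))


print(all_kmers_sketch(2, ["C", "A"]))  # ['AA', 'AC', 'CA', 'CC']
```

The result has len(alphabet) ** k entries (assuming no duplicate characters), so it grows quickly for amino acid alphabets.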
- static create_gapped_kmers_from_sequence(sequence: immuneML.data_model.receptor.receptor_sequence.ReceptorSequence.ReceptorSequence, sequence_type: immuneML.environment.SequenceType.SequenceType, k_left: int, max_gap: int, k_right: Optional[int] = None, min_gap: int = 0)[source]
- static create_gapped_kmers_from_string(sequence, k_left: int, max_gap: int, k_right: Optional[int] = None, min_gap: int = 0)[source]
- static create_kmers_from_sequence(sequence: immuneML.data_model.receptor.receptor_sequence.ReceptorSequence.ReceptorSequence, k: int, sequence_type: immuneML.environment.SequenceType.SequenceType, overlap: bool = True)[source]
- static create_sentences_from_repertoire(repertoire: immuneML.data_model.repertoire.Repertoire.Repertoire, k: int, sequence_type: immuneML.environment.SequenceType.SequenceType, overlap: bool = True)[source]
immuneML.util.NameBuilder module
- class immuneML.util.NameBuilder.NameBuilder[source]
Bases:
object
- static build_name_from_dict(dictionary: dict, level=0)[source]
Creates a name from a dictionary that includes all of its parameters. Nested dictionaries are handled up to a depth of 10, inclusive.
- Parameters
dictionary (dict) – dictionary to create the name from
level (int) – controls recursion level, user should keep default
- Returns
name (str)
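The recursion described for build_name_from_dict might look like the sketch below. The output format (keys and values joined with underscores) and the behaviour beyond the depth limit (raising an error) are assumptions made for illustration; `build_name_sketch` and `max_depth` are hypothetical names, not part of immuneML.

```python
def build_name_sketch(dictionary: dict, level: int = 0, max_depth: int = 10) -> str:
    """Flatten a possibly nested dictionary into a single name (hypothetical helper)."""
    # Assumption: nesting deeper than max_depth is rejected instead of silently truncated.
    if level > max_depth:
        raise ValueError(f"Dictionary is nested deeper than {max_depth} levels.")

    parts = []
    for key, value in dictionary.items():
        if isinstance(value, dict):
            # Recurse into nested dictionaries, tracking the current depth.
            parts.append(f"{key}_{build_name_sketch(value, level + 1, max_depth)}")
        else:
            parts.append(f"{key}_{value}")
    return "_".join(parts)


print(build_name_sketch({"k": 3, "model": {"C": 1}}))  # k_3_model_C_1
```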
immuneML.util.NumpyHelper module
immuneML.util.ParameterValidator module
- class immuneML.util.ParameterValidator.ParameterValidator[source]
Bases:
object
- static assert_all_in_valid_list(values: list, valid_values: list, location: str, parameter_name: str)[source]
- static assert_all_type_and_value(values, parameter_type, location: str, parameter_name: str, min_inclusive=None, max_inclusive=None)[source]
- static assert_keys(keys, valid_keys, location: str, parameter_name: str, exclusive: bool = True)[source]
immuneML.util.PathBuilder module
immuneML.util.PositionHelper module
- class immuneML.util.PositionHelper.PositionHelper[source]
Bases:
object
- static adjust_position_weights(sequence_position_weights: dict, imgt_positions, limit: int) dict [source]
- Parameters
sequence_position_weights – weights supplied by the user as to where in the receptor_sequence to implant
imgt_positions – IMGT positions present in the specific receptor_sequence
limit – how far from the end of the receptor_sequence the motif must start, at the latest, so that it does not elongate the receptor_sequence
- Returns
position_weights for implanting a motif instance into a receptor_sequence
- static build_position_weights(sequence_position_weights: dict, imgt_positions, limit: int) dict [source]
- static gen_imgt_positions_from_sequence(sequence: immuneML.data_model.receptor.receptor_sequence.ReceptorSequence.ReceptorSequence)[source]
immuneML.util.ReflectionHandler module
- class immuneML.util.ReflectionHandler.ReflectionHandler[source]
Bases:
object
- static get_class_from_path(path, class_name: Optional[str] = None)[source]
obtain the class reference from the given path
- Parameters
path (str or pathlib.Path) – path to file where the class is located
class_name (str) – name of the class to import from the file; if None, it is assumed that the class name is the same as the file name
- Returns
class
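Loading a class from a file path, as get_class_from_path is described to do, can be sketched with the standard library's importlib. This is an illustration, not the immuneML implementation; `load_class_from_path` is a hypothetical helper that mirrors the documented default of using the file name as the class name.

```python
import importlib.util
from pathlib import Path
from typing import Optional


def load_class_from_path(path, class_name: Optional[str] = None):
    """Import the module at `path` and return the requested class (hypothetical helper)."""
    path = Path(path)
    # Default to the file name without its extension, as the docstring describes.
    class_name = class_name if class_name is not None else path.stem

    # Build a module spec from the file location, then execute the module.
    spec = importlib.util.spec_from_file_location(path.stem, str(path))
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)

    # Raises AttributeError if the module does not define the class.
    return getattr(module, class_name)
```

For example, `load_class_from_path("encoders/MyEncoder.py")` would return the `MyEncoder` class defined in that file, where both the path and the class are placeholders.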
immuneML.util.RepertoireBuilder module
immuneML.util.SequenceAnalysisHelper module
- class immuneML.util.SequenceAnalysisHelper.SequenceAnalysisHelper[source]
Bases:
object
- static compute_overlap_matrix(hp_items: List[immuneML.hyperparameter_optimization.states.HPItem.HPItem])[source]
immuneML.util.StringHelper module
immuneML.util.TCRdistHelper module
- class immuneML.util.TCRdistHelper.TCRdistHelper[source]
Bases:
object
- static compute_tcr_dist(dataset: immuneML.data_model.dataset.ReceptorDataset.ReceptorDataset, labels: list, cores: int = 1)[source]
- static prepare_tcr_dist_dataframe(dataset: immuneML.data_model.dataset.ReceptorDataset.ReceptorDataset, labels: list) pandas.DataFrame [source]