immuneML.data_model.repertoire package

Submodules

immuneML.data_model.repertoire.Repertoire module

class immuneML.data_model.repertoire.Repertoire.Repertoire(data_filename: pathlib.Path, metadata_filename: pathlib.Path, identifier: str)[source]

Bases: immuneML.data_model.DatasetItem.DatasetItem

Repertoire object consisting of sequence objects. Each sequence attribute is stored as a list across all sequences and can be loaded separately. Internally, this class relies on numpy to store and load the data.
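The attribute-wise storage described above can be illustrated with a minimal sketch. It assumes one numpy array per attribute, saved in a single .npz archive; the actual on-disk layout used by immuneML may differ. The file name and the toy values are hypothetical.

```python
import tempfile
from pathlib import Path

import numpy as np

# Hypothetical toy data: one array per attribute, aligned by sequence index.
attributes = {
    "sequence_aas": np.array(["CASSL", "CASSQ", "CATSD"]),
    "v_genes": np.array(["TRBV1", "TRBV2", "TRBV3"]),
    "counts": np.array([10, 3, 7]),
}

with tempfile.TemporaryDirectory() as tmp:
    # Assumed layout for illustration only, not immuneML's file format.
    data_file = Path(tmp) / "repertoire_sketch.npz"
    np.savez(data_file, **attributes)

    # An .npz archive is read member by member, so asking for "counts"
    # does not load the sequence or gene arrays into memory.
    with np.load(data_file) as archive:
        stored_fields = sorted(archive.files)
        counts = archive["counts"]

total_count = int(counts.sum())
```

Keeping attributes in separate arrays means an analysis that needs only the counts never pays the memory cost of the sequences themselves.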

FIELDS = ('sequence_aas', 'sequences', 'v_genes', 'j_genes', 'v_subgroups', 'j_subgroups', 'v_alleles', 'j_alleles', 'chains', 'counts', 'region_types', 'frame_types', 'sequence_identifiers', 'cell_ids')
classmethod build(sequence_aas: Optional[list] = None, sequences: Optional[list] = None, v_genes: Optional[list] = None, j_genes: Optional[list] = None, v_subgroups: Optional[list] = None, j_subgroups: Optional[list] = None, v_alleles: Optional[list] = None, j_alleles: Optional[list] = None, chains: Optional[list] = None, counts: Optional[list] = None, region_types: Optional[list] = None, frame_types: Optional[list] = None, custom_lists: Optional[dict] = None, sequence_identifiers: Optional[list] = None, path: Optional[pathlib.Path] = None, metadata: Optional[dict] = None, signals: Optional[dict] = None, cell_ids: Optional[List[str]] = None, filename_base: Optional[str] = None)[source]
classmethod build_from_sequence_objects(sequence_objects: list, path: pathlib.Path, metadata: dict, filename_base: Optional[str] = None)[source]
classmethod build_like(repertoire, indices_to_keep: list, result_path: pathlib.Path, filename_base: Optional[str] = None)[source]
property cells: immuneML.data_model.cell.CellList.CellList
A property that creates a list of Cell objects based on the cell_ids field in the following manner:
  • all sequences that have the same cell_id are grouped together

  • they are divided into groups based on the chain

  • all valid combinations of chains are created and used to make a receptor object - this means that if a cell has two beta (b1 and b2) and one alpha chain (a1), two receptor objects will be created: receptor1 (b1, a1), receptor2 (b2, a1)

  • an object of the Cell class is created from all receptors with the same cell_id created as described in the previous steps

To avoid having multiple receptors in the same cell, use one of the preprocessing classes that can merge or eliminate multiple sequences. See the documentation of the preprocessing module for more information.
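The grouping and pairing steps can be sketched in plain Python. This is an illustration only, not the immuneML implementation: the records are hypothetical (cell_id, chain, sequence id) tuples rather than ReceptorSequence objects, and it assumes that beta-alpha is the only valid chain combination.

```python
from collections import defaultdict
from itertools import product

# Hypothetical toy records: (cell_id, chain, sequence id).
sequences = [
    ("cell1", "beta", "b1"),
    ("cell1", "beta", "b2"),
    ("cell1", "alpha", "a1"),
    ("cell2", "alpha", "a2"),
    ("cell2", "beta", "b3"),
]


def pair_chains(records, first_chain="beta", second_chain="alpha"):
    """Group records by cell_id, split them by chain, and pair all chains."""
    by_cell = defaultdict(lambda: defaultdict(list))
    for cell_id, chain, seq_id in records:
        by_cell[cell_id][chain].append(seq_id)

    cells = {}
    for cell_id, chains in by_cell.items():
        # Every first_chain sequence is combined with every second_chain
        # sequence; a cell missing one of the chains yields no receptors.
        cells[cell_id] = list(product(chains[first_chain], chains[second_chain]))
    return cells


cells = pair_chains(sequences)
```

Here `cells["cell1"]` holds two pairs, (b1, a1) and (b2, a1), matching the two-beta, one-alpha case described in the list above.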

Returns

a list of objects of Cell class

Return type

CellList

static check_count(sequence_aas: Optional[list] = None, sequences: Optional[list] = None, custom_lists: Optional[dict] = None) int[source]
free_memory()[source]
get_attribute(attribute)[source]
get_attributes(attributes: list)[source]
get_chains()[source]
get_counts()[source]
get_element_count()[source]
get_j_genes()[source]
get_sequence_aas()[source]
get_sequence_identifiers()[source]
get_sequence_objects(load_implants: bool = False) List[immuneML.data_model.receptor.receptor_sequence.ReceptorSequence.ReceptorSequence][source]

Lazily loads sequences from disk to reduce RAM consumption.

Parameters

load_implants – whether implants should be parsed to objects and converted to ImplantAnnotations; if True, might slow down the loading

Returns

a list of ReceptorSequence objects
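The lazy-loading pattern mentioned here can be sketched as follows. The class `LazyRecords`, its JSON file, and the `load_count` counter are hypothetical stand-ins chosen for illustration; the sketch does not reproduce how Repertoire reads its data.

```python
import json
import tempfile
from pathlib import Path


class LazyRecords:
    """Hypothetical stand-in that reads its data from disk on first access."""

    def __init__(self, data_filename: Path):
        self.data_filename = data_filename
        self._data = None    # nothing is held in memory until requested
        self.load_count = 0  # counts disk reads, for demonstration only

    def get_records(self) -> list:
        if self._data is None:
            self._data = json.loads(self.data_filename.read_text())
            self.load_count += 1
        return self._data

    def free_memory(self):
        # Drop the cached data; the next access reads from disk again.
        self._data = None


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "records.json"
    path.write_text(json.dumps(["CASSL", "CASSQ"]))

    lazy = LazyRecords(path)
    first = lazy.get_records()   # reads from disk
    second = lazy.get_records()  # served from the in-memory cache
    lazy.free_memory()
    third = lazy.get_records()   # reads from disk again
    loads = lazy.load_count
```

The trade-off is a disk read on first access in exchange for not holding the data in memory while it is unused.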

get_v_genes()[source]
load_data()[source]
static process_custom_lists(custom_lists)[source]
property receptors: List[immuneML.data_model.receptor.Receptor.Receptor]
A property that creates a list of Receptor objects based on the cell_ids field in the following manner:
  • all sequences that have the same cell_id are grouped together

  • they are divided into groups based on the chain

  • all valid combinations of chains are created and used to make a receptor object - this means that if a cell has two beta (b1 and b2) and one alpha chain (a1), two receptor objects will be created: receptor1 (b1, a1), receptor2 (b2, a1)

To avoid having multiple receptors in the same cell, use one of the preprocessing classes that can merge or eliminate multiple sequences. See the documentation of the preprocessing module for more information.

Returns

a list of objects of Receptor class

Return type

ReceptorList

property sequences

Module contents