immuneML.data_model.repertoire package
Submodules
immuneML.data_model.repertoire.Repertoire module
- class immuneML.data_model.repertoire.Repertoire.Repertoire(data_filename: pathlib.Path, metadata_filename: pathlib.Path, identifier: str)[source]
Bases: immuneML.data_model.DatasetItem.DatasetItem
Repertoire object consisting of sequence objects. Each sequence attribute is stored as a list across all sequences and can be loaded separately. Internally, this class relies on numpy to store and load the data.
- FIELDS = ('sequence_aas', 'sequences', 'v_genes', 'j_genes', 'v_subgroups', 'j_subgroups', 'v_alleles', 'j_alleles', 'chains', 'counts', 'region_types', 'frame_types', 'sequence_identifiers', 'cell_ids')
- classmethod build(sequence_aas: Optional[list] = None, sequences: Optional[list] = None, v_genes: Optional[list] = None, j_genes: Optional[list] = None, v_subgroups: Optional[list] = None, j_subgroups: Optional[list] = None, v_alleles: Optional[list] = None, j_alleles: Optional[list] = None, chains: Optional[list] = None, counts: Optional[list] = None, region_types: Optional[list] = None, frame_types: Optional[list] = None, custom_lists: Optional[dict] = None, sequence_identifiers: Optional[list] = None, path: Optional[pathlib.Path] = None, metadata: Optional[dict] = None, signals: Optional[dict] = None, cell_ids: Optional[List[str]] = None, filename_base: Optional[str] = None)[source]
- classmethod build_from_sequence_objects(sequence_objects: list, path: pathlib.Path, metadata: dict, filename_base: Optional[str] = None)[source]
- classmethod build_like(repertoire, indices_to_keep: list, result_path: pathlib.Path, filename_base: Optional[str] = None)[source]
- property cells: immuneML.data_model.cell.CellList.CellList
- A property that creates a list of Cell objects based on the cell_ids field in the following manner:
1. all sequences that have the same cell_id are grouped together
2. they are divided into groups based on the chain
3. all valid combinations of chains are created and used to make a receptor object; for example, if a cell has two beta chains (b1 and b2) and one alpha chain (a1), two receptor objects are created: receptor1 (b1, a1) and receptor2 (b2, a1)
4. an object of the Cell class is created from all receptors with the same cell_id created as described in the previous steps
To avoid having multiple receptors in the same cell, use one of the preprocessing classes that can merge or eliminate multiple sequences. See the documentation of the preprocessing module for more information.
- Returns
a list of objects of Cell class
- Return type
CellList
- static check_count(sequence_aas: Optional[list] = None, sequences: Optional[list] = None, custom_lists: Optional[dict] = None) int [source]
- get_sequence_objects(load_implants: bool = False) List[immuneML.data_model.receptor.receptor_sequence.ReceptorSequence.ReceptorSequence] [source]
Lazily loads sequences from disk to reduce RAM consumption
- Parameters
load_implants – whether implants should be parsed to objects and converted to ImplantAnnotations; if True, might slow down the loading
- Returns
a list of ReceptorSequence objects
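The lazy-loading idea can be illustrated with a minimal sketch. This is not immuneML's implementation (the class description above says it relies on numpy). The class `LazyFieldStore`, its `get_field` method, the one-JSON-file-per-field layout, and the example values are all hypothetical and serve only to show that nothing is read from disk until a field is requested, and that each field can be loaded separately.

```python
import json
import tempfile
from pathlib import Path


class LazyFieldStore:
    """Hypothetical store: one JSON file per field, read only on request."""

    def __init__(self, directory: Path):
        # Only the location is kept; no file is opened here.
        self.directory = directory
        self.load_count = 0

    def get_field(self, name: str) -> list:
        # Read just the file for the requested field.
        with (self.directory / f"{name}.json").open() as handle:
            values = json.load(handle)
        self.load_count += 1
        return values


with tempfile.TemporaryDirectory() as tmp:
    directory = Path(tmp)
    # Made-up example values, written as two separate field files.
    (directory / "sequence_aas.json").write_text(json.dumps(["CASSL", "CASSQ"]))
    (directory / "counts.json").write_text(json.dumps([10, 3]))

    store = LazyFieldStore(directory)
    loads_before = store.load_count      # nothing has been read yet
    sequence_aas = store.get_field("sequence_aas")
    loads_after = store.load_count       # exactly one field file was read
```

In this sketch the `counts` file is never opened, which is the memory-saving behaviour the description refers to.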
- property receptors: List[immuneML.data_model.receptor.Receptor.Receptor]
- A property that creates a list of Receptor objects based on the cell_ids field in the following manner:
1. all sequences that have the same cell_id are grouped together
2. they are divided into groups based on the chain
3. all valid combinations of chains are created and used to make a receptor object; for example, if a cell has two beta chains (b1 and b2) and one alpha chain (a1), two receptor objects are created: receptor1 (b1, a1) and receptor2 (b2, a1)
To avoid having multiple receptors in the same cell, use one of the preprocessing classes that can merge or eliminate multiple sequences. See the documentation of the preprocessing module for more information.
- Returns
a list of objects of Receptor class
- Return type
ReceptorList
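The grouping and pairing rule described for the cells and receptors properties can be sketched in plain Python. This is a simplified illustration under stated assumptions, not the library's code: the function `pair_chains`, the `(cell_id, chain, sequence_id)` tuple layout, and the chain labels `"alpha"` and `"beta"` are hypothetical, and only alpha/beta pairing is modelled, whereas the text speaks of all valid chain combinations.

```python
from collections import defaultdict
from itertools import product


def pair_chains(records):
    """Group (cell_id, chain, sequence_id) tuples and pair beta with alpha chains."""
    # Steps 1 and 2: group by cell_id, then by chain within each cell.
    by_cell = defaultdict(lambda: defaultdict(list))
    for cell_id, chain, sequence_id in records:
        by_cell[cell_id][chain].append(sequence_id)

    # Step 3: build every (beta, alpha) combination within each cell.
    receptors_per_cell = {}
    for cell_id, chains in by_cell.items():
        receptors_per_cell[cell_id] = list(
            product(chains.get("beta", []), chains.get("alpha", []))
        )
    return receptors_per_cell


# The example from the text: two beta chains and one alpha chain in one cell.
records = [
    ("cell1", "beta", "b1"),
    ("cell1", "beta", "b2"),
    ("cell1", "alpha", "a1"),
    ("cell2", "beta", "b3"),  # no alpha chain, so no pair can be formed
]
paired = pair_chains(records)
# paired["cell1"] == [("b1", "a1"), ("b2", "a1")]
```

Each dictionary entry corresponds to one cell, and each tuple in its list corresponds to one receptor, which mirrors how a Cell collects all receptors that share a cell_id.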
- property sequences