immuneML.data_model.repertoire package
Submodules
immuneML.data_model.repertoire.Repertoire module
- class immuneML.data_model.repertoire.Repertoire.Repertoire(data_filename: Path, metadata_filename: Path, identifier: str)
Bases:
DatasetItem
A Repertoire object consists of sequence objects. Each sequence attribute is stored as a list across all sequences and can be loaded separately. Internally, this class relies on numpy to store and load the data.
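As an illustration of the column-wise layout described above, the following minimal sketch stores each attribute as its own NumPy array in one archive and reads back a single field. It is not immuneML's actual storage code: the in-memory buffer, the three example fields and their values, and the helper names save_columns and load_field are assumptions made for this example.

```python
import io

import numpy as np


def save_columns(columns: dict) -> io.BytesIO:
    """Store every attribute as a separate array in one .npz archive (hypothetical helper)."""
    buffer = io.BytesIO()
    np.savez(buffer, **{name: np.asarray(values) for name, values in columns.items()})
    buffer.seek(0)
    return buffer


def load_field(buffer: io.BytesIO, field: str) -> np.ndarray:
    """Read only the requested attribute; the other arrays in the archive are not decoded."""
    buffer.seek(0)
    with np.load(buffer) as archive:
        return archive[field]


# Made-up example data: one list per attribute, aligned by position.
columns = {
    "sequence_aas": ["CASSLGTDTQYF", "CASSPGQGNEQFF"],
    "v_genes": ["TRBV7-9", "TRBV5-1"],
    "counts": [12, 3],
}
stored = save_columns(columns)
counts = load_field(stored, "counts")  # only the counts column is materialized
```

Because every attribute is a separate array, a caller that needs only the counts never has to decode the sequences themselves.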
- FIELDS = ('sequence_aas', 'sequences', 'v_genes', 'j_genes', 'v_subgroups', 'j_subgroups', 'v_alleles', 'j_alleles', 'chains', 'counts', 'region_types', 'frame_types', 'sequence_identifiers', 'cell_ids')
- classmethod build(sequence_aas: list = None, sequences: list = None, v_genes: list = None, j_genes: list = None, v_subgroups: list = None, j_subgroups: list = None, v_alleles: list = None, j_alleles: list = None, chains: list = None, counts: list = None, region_types: list = None, frame_types: list = None, custom_lists: dict = None, sequence_identifiers: list = None, path: Path = None, metadata: dict = None, signals: dict = None, cell_ids: List[str] = None, filename_base: str = None)
- classmethod build_from_sequence_objects(sequence_objects: list, path: Path, metadata: dict, filename_base: str = None)
- classmethod build_like(repertoire, indices_to_keep: list, result_path: Path, filename_base: str = None)
- property cells: CellList
- A property that creates a list of Cell objects based on the cell_ids field in the following manner:
1. all sequences that have the same cell_id are grouped together;
2. they are divided into groups based on the chain;
3. all valid combinations of chains are created and each is used to make a receptor object. For example, if a cell has two beta chains (b1 and b2) and one alpha chain (a1), two receptor objects are created: receptor1 (b1, a1) and receptor2 (b2, a1);
4. an object of the Cell class is created from all receptors with the same cell_id, built as described in the previous steps.
To avoid having multiple receptors in the same cell, use the preprocessing classes that can merge or eliminate multiple sequences. See the documentation of the preprocessing module for more information.
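The grouping described above can be sketched in plain Python. This is only an illustration of the logic, not the library's implementation: the record tuples, the build_cells helper, and the assumption that a valid combination is one beta chain paired with one alpha chain are all made up for this example.

```python
from collections import defaultdict
from itertools import product

# Each record is (cell_id, chain, sequence_id); the values are invented for illustration.
records = [
    ("cell1", "alpha", "a1"),
    ("cell1", "beta", "b1"),
    ("cell1", "beta", "b2"),
    ("cell2", "alpha", "a2"),
    ("cell2", "beta", "b3"),
]


def build_cells(records):
    """Group records by cell, split them by chain, and pair every beta with every alpha."""
    # cell_id -> chain -> list of sequence ids
    by_cell = defaultdict(lambda: defaultdict(list))
    for cell_id, chain, sequence_id in records:
        by_cell[cell_id][chain].append(sequence_id)

    cells = {}
    for cell_id, chains in by_cell.items():
        # Every (beta, alpha) combination becomes one receptor.
        receptors = list(product(chains["beta"], chains["alpha"]))
        # All receptors that share a cell_id form one cell.
        cells[cell_id] = receptors
    return cells


cells = build_cells(records)
# cells["cell1"] holds two receptors: ("b1", "a1") and ("b2", "a1")
```

In this sketch a cell with two beta chains and one alpha chain yields two receptors, which matches the b1/b2/a1 example in the description.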
- Returns:
a list of Cell objects
- Return type:
CellList
- static check_count(sequence_aas: list = None, sequences: list = None, custom_lists: dict = None) → int
- get_sequence_objects(load_implants: bool = True) → List[ReceptorSequence]
Lazily loads sequences from disk to reduce RAM consumption.
- Parameters:
load_implants – whether implants should be parsed to objects and converted to ImplantAnnotations; if True, might slow down the loading
- Returns:
a list of ReceptorSequence objects
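The lazy-loading idea can be illustrated with a small stand-alone sketch. It does not reproduce immuneML's code: the LazyRepertoire and SequenceRecord classes, the CSV file format, and the column names are hypothetical and were chosen only to keep the example self-contained. The load_implants option is not modelled here.

```python
import csv
import tempfile
from dataclasses import dataclass
from pathlib import Path
from typing import List


@dataclass
class SequenceRecord:
    """Hypothetical stand-in for a ReceptorSequence object."""
    identifier: str
    sequence_aa: str


class LazyRepertoire:
    """Keeps only the file path in memory; sequence objects are built on request."""

    def __init__(self, data_filename: Path):
        self.data_filename = data_filename

    def get_sequence_objects(self) -> List[SequenceRecord]:
        # The file is read here, at call time, and not in __init__.
        with open(self.data_filename, newline="") as handle:
            return [SequenceRecord(row["identifier"], row["sequence_aa"])
                    for row in csv.DictReader(handle)]


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "repertoire.csv"
    path.write_text("identifier,sequence_aa\ns1,CASSLGTDTQYF\ns2,CASSPGQGNEQFF\n")
    repertoire = LazyRepertoire(path)              # nothing is read yet
    sequences = repertoire.get_sequence_objects()  # the file is read now
```

The object itself stays small because it holds a path rather than the data, so many repertoires can be kept in a dataset without loading all of their sequences at once.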
- property receptors: List[Receptor]
- A property that creates a list of Receptor objects based on the cell_ids field in the following manner:
1. all sequences that have the same cell_id are grouped together;
2. they are divided into groups based on the chain;
3. all valid combinations of chains are created and each is used to make a receptor object. For example, if a cell has two beta chains (b1 and b2) and one alpha chain (a1), two receptor objects are created: receptor1 (b1, a1) and receptor2 (b2, a1).
To avoid having multiple receptors in the same cell, use the preprocessing classes that can merge or eliminate multiple sequences. See the documentation of the preprocessing module for more information.
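To make the effect of eliminating sequences concrete, the sketch below keeps a single sequence per cell and chain before any pairing takes place, so that each cell can yield at most one receptor. It is not one of immuneML's preprocessing classes: the keep_most_abundant helper, the record dictionaries, and the rule of keeping the highest count (first record wins on ties) are assumptions chosen for illustration.

```python
def keep_most_abundant(records):
    """Keep, for each (cell_id, chain) pair, only the record with the highest count."""
    best = {}
    for record in records:
        key = (record["cell_id"], record["chain"])
        # A later record replaces the stored one only if its count is strictly higher.
        if key not in best or record["count"] > best[key]["count"]:
            best[key] = record
    return list(best.values())


# Invented example: one cell with one alpha chain and two beta chains.
records = [
    {"cell_id": "cell1", "chain": "alpha", "sequence_id": "a1", "count": 5},
    {"cell_id": "cell1", "chain": "beta", "sequence_id": "b1", "count": 9},
    {"cell_id": "cell1", "chain": "beta", "sequence_id": "b2", "count": 2},
]
filtered = keep_most_abundant(records)
# Only a1 and b1 remain, so the cell can produce the single receptor (b1, a1).
```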
- Returns:
a list of Receptor objects
- Return type:
ReceptorList
- property sequences