immuneML.simulation.implants package

Submodules

immuneML.simulation.implants.ImplantAnnotation module

class immuneML.simulation.implants.ImplantAnnotation.ImplantAnnotation(signal_id: str = None, motif_id: str = None, motif_instance: str = None, position: int = None)[source]

Bases: object

motif_id: str = None
motif_instance: str = None
position: int = None
signal_id: str = None

immuneML.simulation.implants.Motif module

class immuneML.simulation.implants.Motif.Motif(identifier: str)[source]

Bases: object

Motifs are the objects that are implanted into sequences during simulation. They are defined under definitions/motifs. There are several motif types, each with its own parameters.

abstract get_all_possible_instances(sequence_type: SequenceType)[source]
abstract get_alphabet() → List[str][source]
abstract get_max_length() → int[source]
identifier: str
abstract instantiate_motif(sequence_type: SequenceType = SequenceType.AMINO_ACID) → MotifInstance[source]
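As an illustration of the abstract interface listed above, the following minimal sketch implements the four abstract methods in a subclass. It does not import immuneML: SequenceType, MotifInstance and Motif are simplified stand-ins written for this example, and FixedSeedMotif with its seed parameter is a hypothetical class that is not part of the package.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from enum import Enum
from typing import List


class SequenceType(Enum):  # stand-in for immuneML's SequenceType
    AMINO_ACID = "amino_acid"
    NUCLEOTIDE = "nucleotide"


@dataclass
class MotifInstance:  # stand-in mirroring MotifInstance(instance, gap)
    instance: str
    gap: int


class Motif(ABC):  # stand-in exposing the abstract methods listed above
    def __init__(self, identifier: str):
        self.identifier = identifier

    @abstractmethod
    def get_all_possible_instances(self, sequence_type: SequenceType): ...

    @abstractmethod
    def get_alphabet(self) -> List[str]: ...

    @abstractmethod
    def get_max_length(self) -> int: ...

    @abstractmethod
    def instantiate_motif(self, sequence_type: SequenceType = SequenceType.AMINO_ACID) -> MotifInstance: ...


class FixedSeedMotif(Motif):  # hypothetical subclass, not part of immuneML
    def __init__(self, identifier: str, seed: str):
        super().__init__(identifier)
        self.seed = seed

    def get_all_possible_instances(self, sequence_type: SequenceType):
        # a fixed seed has exactly one possible instance, without a gap
        return [MotifInstance(self.seed, 0)]

    def get_alphabet(self) -> List[str]:
        # distinct characters used by the seed, in sorted order
        return sorted(set(self.seed))

    def get_max_length(self) -> int:
        return len(self.seed)

    def instantiate_motif(self, sequence_type: SequenceType = SequenceType.AMINO_ACID) -> MotifInstance:
        return MotifInstance(self.seed, 0)


motif = FixedSeedMotif("my_simple_motif", "AAC")
instance = motif.instantiate_motif()
```

Because every method is marked abstract, the stand-in Motif cannot be instantiated directly; only a subclass that implements all four methods can.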

immuneML.simulation.implants.MotifInstance module

class immuneML.simulation.implants.MotifInstance.MotifInstance(instance: str, gap: int)[source]

Bases: object

class immuneML.simulation.implants.MotifInstance.MotifInstanceGroup(iterable=(), /)[source]

Bases: list

immuneML.simulation.implants.Signal module

class immuneML.simulation.implants.Signal.Signal(id: str, motifs: List[Motif | List[Motif]] = None, sequence_position_weights: dict = None, v_call: str = None, j_call: str = None, clonal_frequency: dict = None, is_present_custom_func: Callable = None)[source]

Bases: object

A signal represents a collection of motifs, and optionally, position weights showing where one of the motifs of the signal can occur in a sequence. The signals are defined under definitions/signals.

A signal is associated with a metadata label, which is assigned to a receptor or a repertoire: for example, antigen-specific or disease-associated for a receptor, or diseased for a repertoire.

Note

IMGT positions

To use sequence position weights, IMGT positions should be specified explicitly as strings, in quotation marks, so that all positions can be properly distinguished.

Specification arguments:

  • motifs (list): a list of the motifs associated with this signal, each defined either by a seed or by a position weight matrix. Alternatively, it can be a list of lists of motifs, in which case the motifs in the same sublist (at most 2 motifs) have to co-occur in the same sequence.

  • sequence_position_weights (dict): a dictionary specifying, for each IMGT position in the sequence, how likely it is for the signal to be there. If a position is not present in the sequence, the probability of the signal occurring at that position is redistributed to the other positions whose probabilities are not explicitly set to 0 by the user.

  • v_call (str): V gene (with allele, if available) that has to co-occur with one of the motifs for the signal to exist. It can be used in combination with rejection sampling or full sequence implanting and is ignored otherwise. For rejection sampling, a sequence matches if this value is contained in the same field of the generated sequence.

  • j_call (str): J gene (with allele, if available) that has to co-occur with one of the motifs for the signal to exist. It can be used in combination with rejection sampling or full sequence implanting and is ignored otherwise. For rejection sampling, a sequence matches if this value is contained in the same field of the generated sequence.

  • source_file (str): path to the file where the custom signal function is; cannot be combined with the arguments listed above (motifs, v_call, j_call, sequence_position_weights)

  • is_present_func (str): name of the function in the file given by source_file that will be used to specify the signal; the function’s signature must be:

def is_present(sequence_aa: str, sequence: str, v_call: str, j_call: str) -> bool:
    # custom implementation where all or some of these arguments can be used
    ...

  • clonal_frequency (dict): parameters of the distribution used to simulate clonal frequencies, for example:

clonal_frequency:
  a: 2 # shape parameter of the distribution
  loc: 0 # 0 by default but can be used to shift the distribution
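The redistribution described for sequence_position_weights can be sketched as follows. This is only one possible reading of that description, not the package's implementation: the function name redistribute_weights is hypothetical, and the choice to spread the missing probability evenly over the eligible positions is an assumption made for this example.

```python
from typing import Dict, List


def redistribute_weights(position_weights: Dict[str, float],
                         sequence_positions: List[str]) -> Dict[str, float]:
    """Hypothetical helper: adapt user-defined weights to one sequence."""
    # start from the user's weight, or 0.0 for positions the user did not mention
    weights = {pos: position_weights.get(pos, 0.0) for pos in sequence_positions}

    # probability assigned to positions that do not occur in this sequence
    missing_mass = sum(w for pos, w in position_weights.items() if pos not in weights)

    # eligible receivers: present positions not explicitly set to 0 by the user
    # (unspecified positions return None here, so they remain eligible)
    eligible = [pos for pos in sequence_positions if position_weights.get(pos) != 0]

    # assumption: the missing mass is split evenly among eligible positions
    if eligible:
        share = missing_mass / len(eligible)
        for pos in eligible:
            weights[pos] += share
    return weights


user_weights = {"108": 0.0, "109": 0.5, "110": 0.5}
# IMGT positions are strings; position '110' is absent from this sequence
adapted = redistribute_weights(user_weights, ["107", "108", "109"])
```

In this example, position '108' keeps a weight of 0 because the user set it explicitly, while the weight of the absent position '110' is shared between '107' and '109'.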

YAML specification:

definitions:
    signals:
        my_signal:
            motifs:
                - my_simple_motif
                - my_gapped_motif
            sequence_position_weights:
                '109': 0.5
                '110': 0.5
            v_call: TRBV1
            j_call: TRBJ1
            clonal_frequency:
                a: 2
                loc: 0
        signal_with_custom_func:
            source_file: signal_func.py
            is_present_func: is_signal_present
            clonal_frequency:
                a: 2
                loc: 0
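For signal_with_custom_func in the specification above, the file signal_func.py might look like the following minimal sketch. The function name matches is_present_func and the signature follows the one required above; the rule itself (the motif "AAC" together with a V gene containing "TRBV1") is purely hypothetical and chosen only for illustration.

```python
# signal_func.py (illustrative content; the rule below is an assumption)

def is_signal_present(sequence_aa: str, sequence: str, v_call: str, j_call: str) -> bool:
    # hypothetical rule: the amino acid sequence contains the motif "AAC" ...
    has_motif = "AAC" in sequence_aa
    # ... and the V gene field contains "TRBV1" (a plain substring check)
    has_v_gene = "TRBV1" in v_call
    # the nucleotide sequence and j_call are available but unused in this sketch
    return has_motif and has_v_gene
```

A plain substring check like this one is permissive: "TRBV1" is also contained in a value such as "TRBV10-1*01", so a stricter comparison may be needed if only one specific gene should match.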

clonal_frequency: dict = None
get_all_motif_instances(sequence_type: SequenceType)[source]
id: str
is_present_custom_func: Callable = None
j_call: str = None
make_motif_instances(count, sequence_type: SequenceType)[source]
motifs: List[Motif | List[Motif]] = None
sequence_position_weights: dict = None
v_call: str = None

class immuneML.simulation.implants.Signal.SignalPair(signal1: immuneML.simulation.implants.Signal.Signal, signal2: immuneML.simulation.implants.Signal.Signal)[source]

Bases: object

property clonal_frequency
property id: str
property j_call
signal1: Signal
signal2: Signal
property v_call

Module contents