immuneML.data_model.encoded_data package
Submodules
immuneML.data_model.encoded_data.EncodedData module
- class immuneML.data_model.encoded_data.EncodedData.EncodedData(examples, labels: dict = None, example_ids: list = None, feature_names: list = None, feature_annotations: pandas.DataFrame = None, encoding: str = None, info: dict = None)
Bases: object
When a dataset is encoded, the result is stored in an object of the EncodedData class.
- Parameters:
examples – a matrix of example_count x feature_count elements (a numpy array or a sparse matrix). Most encodings follow this matrix format; one exception is source.encodings.onehot.OneHotEncoder.OneHotEncoder, where the numpy array has more than two dimensions.
feature_names – a list of feature names with feature_count elements
feature_annotations – a data frame consisting of annotations for each unique feature
example_ids – a list of example (repertoire/sequence/receptor) IDs; its length must equal example_count in the examples matrix
labels – a dict of labels where label names are keys and the values are lists of values for the label across examples: {label_name1: […], label_name2: […]}. Each list associated with a label must contain a value for every example.
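The length constraints in the parameter list above can be illustrated with a minimal sketch. The helper check_encoded_inputs and the toy values are hypothetical and not part of immuneML; nested lists stand in for a dense two-dimensional numpy array, and the sketch does not cover sparse matrices or encodings with more than two dimensions. The commented-out constructor call at the end only mirrors the signature shown above.

```python
def check_encoded_inputs(examples, labels=None, example_ids=None, feature_names=None):
    """Hypothetical helper (not part of immuneML).

    Returns a list of messages describing violated length constraints,
    assuming `examples` is a dense, two-dimensional, matrix-like object.
    """
    example_count = len(examples)
    feature_count = len(examples[0]) if example_count else 0
    problems = []

    # example_ids must have one entry per row of the examples matrix
    if example_ids is not None and len(example_ids) != example_count:
        problems.append("example_ids length differs from example_count")

    # feature_names must have one entry per column of the examples matrix
    if feature_names is not None and len(feature_names) != feature_count:
        problems.append("feature_names length differs from feature_count")

    # every label needs a value for every example
    for name, values in (labels or {}).items():
        if len(values) != example_count:
            problems.append(f"label '{name}' does not cover all examples")

    return problems


# Toy data: 3 examples x 2 features (values are made up for illustration)
examples = [[0.1, 0.9], [0.4, 0.6], [0.8, 0.2]]
labels = {"disease": [True, False, True]}
example_ids = ["rep1", "rep2", "rep3"]
feature_names = ["AAA", "AAC"]

problems = check_encoded_inputs(examples, labels, example_ids, feature_names)

# With immuneML installed, the same inputs (with `examples` as a numpy array)
# could then be passed to the constructor documented above:
# from immuneML.data_model.encoded_data.EncodedData import EncodedData
# encoded = EncodedData(examples, labels=labels, example_ids=example_ids,
#                       feature_names=feature_names)
```

An empty problems list means the inputs are mutually consistent in the sense described by the parameter list; whether EncodedData itself performs such checks is not stated here.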