immuneML.ml_methods.dim_reduction package

Submodules

immuneML.ml_methods.dim_reduction.DimRedMethod module

class immuneML.ml_methods.dim_reduction.DimRedMethod.DimRedMethod(name: str = None)[source]

Bases: ABC

Dimensionality reduction methods are algorithms which can be used to reduce the dimensionality of encoded datasets, in order to uncover and analyze patterns present in the data.

These methods can be used in the ExploratoryAnalysis and Clustering instructions.

DOCS_TITLE = 'Dimensionality reduction methods'
fit(dataset: Dataset = None, design_matrix: ndarray = None)[source]
fit_transform(dataset: Dataset = None, design_matrix: ndarray = None)[source]
abstract get_dimension_names() → List[str][source]
get_explained_variance_ratio()[source]
classmethod get_title()[source]
inverse_transform(transformed_data)[source]
transform(dataset: Dataset = None, design_matrix: ndarray = None)[source]
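The member list above suggests a fit/transform contract in which only get_dimension_names is abstract. As a minimal, self-contained sketch of that shape (not immuneML's actual code), the stand-in classes below mirror only the method names. `SketchDimRed`, `ToyReduction`, `KeepFirstColumns` and the `method` attribute are hypothetical, and the sketch works on plain lists rather than Dataset objects.

```python
from abc import ABC, abstractmethod
from typing import List, Optional


class KeepFirstColumns:
    """Toy estimator: 'reduces' data by keeping the first n_components columns."""

    def __init__(self, n_components: int):
        self.n_components = n_components

    def fit(self, X: List[List[float]]) -> "KeepFirstColumns":
        if self.n_components > len(X[0]):
            raise ValueError("n_components exceeds the number of features")
        return self

    def transform(self, X: List[List[float]]) -> List[List[float]]:
        return [row[: self.n_components] for row in X]


class SketchDimRed(ABC):
    """Hypothetical stand-in that mirrors the method names listed above."""

    def __init__(self, name: Optional[str] = None):
        self.name = name
        self.method = None  # assumption: subclasses set the wrapped estimator

    def fit(self, design_matrix: List[List[float]]) -> None:
        self.method.fit(design_matrix)

    def transform(self, design_matrix: List[List[float]]) -> List[List[float]]:
        return self.method.transform(design_matrix)

    def fit_transform(self, design_matrix: List[List[float]]) -> List[List[float]]:
        # Assumption: equivalent to fit followed by transform.
        self.fit(design_matrix)
        return self.transform(design_matrix)

    @abstractmethod
    def get_dimension_names(self) -> List[str]:
        """Subclasses must name their output dimensions."""


class ToyReduction(SketchDimRed):
    """Concrete subclass: supplies the estimator and the dimension names."""

    def __init__(self, n_components: int = 2, name: Optional[str] = None):
        super().__init__(name)
        self.method = KeepFirstColumns(n_components)

    def get_dimension_names(self) -> List[str]:
        return [f"toy_{i + 1}" for i in range(self.method.n_components)]


reduced = ToyReduction(n_components=2).fit_transform([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
names = ToyReduction(n_components=2).get_dimension_names()
print(reduced, names)
```

Because get_dimension_names is abstract in this sketch, instantiating `SketchDimRed` directly raises a TypeError; only subclasses that name their output dimensions can be created.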

immuneML.ml_methods.dim_reduction.KernelPCA module

class immuneML.ml_methods.dim_reduction.KernelPCA.KernelPCA(name: str = None, **kwargs)[source]

Bases: DimRedMethod

Kernel principal component analysis (kernel PCA) method which wraps scikit-learn’s KernelPCA, allowing for non-linear dimensionality reduction. Input arguments for the method are the same as supported by scikit-learn (see KernelPCA scikit-learn documentation for details), plus two additional immuneML arguments:

  • components (list): which two components (1-indexed) to use for visualization in the DimensionalityReduction report. Default: [1, 2].

  • compute_total_variance (bool): if True, computes the total variance in kernel feature space during fit by building the full n_samples × n_samples kernel matrix, so that explained variance ratios are expressed as a fraction of total kernel-space variance rather than relative to the retained components only. This roughly doubles the fit computation time. Default: False.
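To illustrate what compute_total_variance changes, the sketch below contrasts the two normalizations on made-up eigenvalues. It is plain Python arithmetic, not immuneML's or scikit-learn's code; the eigenvalues and the function name `explained_variance_ratios` are assumptions chosen for illustration.

```python
from typing import List, Optional


def explained_variance_ratios(
    retained: List[float], total_variance: Optional[float] = None
) -> List[float]:
    """Divide retained eigenvalues by the total variance if given, else by their own sum."""
    denominator = sum(retained) if total_variance is None else total_variance
    return [value / denominator for value in retained]


# Hypothetical spectrum of a centered kernel matrix, sorted in decreasing order.
all_eigenvalues = [4.0, 2.0, 1.0, 0.5, 0.5]
retained = all_eigenvalues[:2]  # corresponds to n_components = 2

# Relative to the retained components only: ratios sum to 1 by construction.
relative_to_retained = explained_variance_ratios(retained)

# Relative to the total variance: ratios sum to less than 1 when components are dropped.
relative_to_total = explained_variance_ratios(
    retained, total_variance=sum(all_eigenvalues)
)

print(relative_to_retained)
print(relative_to_total)
```

In this toy case the first component accounts for half of the total variance, yet it appears as two thirds when the ratio is taken over the two retained components only.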

YAML specification:

definitions:
    ml_methods:
        my_kernel_pca:
            KernelPCA:
                n_components: 5
                kernel: rbf
                components: [3, 4]
                compute_total_variance: false
fit(dataset: Dataset = None, design_matrix: ndarray = None)[source]
get_dimension_names() → List[str][source]
get_explained_variance_ratio()[source]

immuneML.ml_methods.dim_reduction.PCA module

class immuneML.ml_methods.dim_reduction.PCA.PCA(name: str = None, **kwargs)[source]

Bases: DimRedMethod

Principal component analysis (PCA) method which wraps scikit-learn’s PCA. Input arguments for the method are the same as supported by scikit-learn (see PCA scikit-learn documentation for details), plus one additional immuneML argument:

  • components (list): which two components (1-indexed) to use for visualization in the DimensionalityReduction report. Default: [1, 2].
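Because components is 1-indexed while Python sequences are 0-indexed, picking the two columns to plot involves an off-by-one shift. A minimal sketch of that mapping, assuming the transformed data is a list of rows; `select_components` is a hypothetical helper, not part of immuneML:

```python
from typing import List, Sequence


def select_components(
    transformed: Sequence[Sequence[float]], components: Sequence[int]
) -> List[List[float]]:
    """Return the columns named by 1-indexed component numbers."""
    n_available = len(transformed[0])
    for component in components:
        if not 1 <= component <= n_available:
            raise ValueError(
                f"component {component} is outside the range 1..{n_available}"
            )
    # Shift by one to convert 1-indexed component numbers to 0-indexed columns.
    return [[row[component - 1] for component in components] for row in transformed]


# Two examples reduced to five components (as with n_components: 5).
rows = [
    [0.1, 0.2, 0.3, 0.4, 0.5],
    [1.1, 1.2, 1.3, 1.4, 1.5],
]
picked = select_components(rows, [3, 4])
print(picked)
```

With n_components: 5 and components: [3, 4], this selects the third and fourth components; a value such as 0 or 6 is rejected in this sketch because it does not name one of the five fitted components.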

YAML specification:

definitions:
    ml_methods:
        my_pca:
            PCA:
                n_components: 5
                components: [3, 4]
get_dimension_names() → List[str][source]
get_explained_variance_ratio()[source]

immuneML.ml_methods.dim_reduction.TSNE module

class immuneML.ml_methods.dim_reduction.TSNE.TSNE(name: str = None, **kwargs)[source]

Bases: DimRedMethod

t-distributed Stochastic Neighbor Embedding (t-SNE) method which wraps scikit-learn’s TSNE. It can be useful for visualizing high-dimensional data. Input arguments for the method are the same as supported by scikit-learn (see TSNE scikit-learn documentation for details), plus one additional immuneML argument:

  • components (list): which two components (1-indexed) to use for visualization in the DimensionalityReduction report. Default: [1, 2].

YAML specification:

definitions:
    ml_methods:
        my_tsne:
            TSNE:
                n_components: 2
                init: pca
                components: [1, 2]
get_dimension_names() → List[str][source]
transform(dataset: Dataset = None, design_matrix: ndarray = None)[source]

immuneML.ml_methods.dim_reduction.UMAP module

class immuneML.ml_methods.dim_reduction.UMAP.UMAP(name: str = None, **kwargs)[source]

Bases: DimRedMethod

Uniform manifold approximation and projection (UMAP) method which wraps umap-learn’s UMAP. Input arguments for the method are the same as supported by umap-learn (see UMAP in the umap-learn documentation for details), plus one additional immuneML argument:

  • components (list): which two components (1-indexed) to use for visualization in the DimensionalityReduction report. Default: [1, 2].

Note that when providing the arguments for UMAP in the immuneML specification, it is not possible to set functions as input values (e.g., for the metric parameter, it has to be one of the predefined metrics available in umap-learn).
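One way to think about this restriction: a YAML specification carries plain values such as strings, numbers, booleans and lists, so a Python callable cannot be expressed in it. The sketch below is a hypothetical pre-check (not immuneML code) that flags callables in a parameter dictionary before it is written to a specification; `find_callable_params` and `my_distance` are illustrative names.

```python
from typing import Any, Dict, List


def find_callable_params(params: Dict[str, Any]) -> List[str]:
    """Return the names of parameters whose values are callables."""
    return sorted(key for key, value in params.items() if callable(value))


def my_distance(a, b):
    """A custom metric that could be passed to umap-learn directly in Python."""
    return sum(abs(x - y) for x, y in zip(a, b))


# Metric given by name: expressible in a YAML specification.
ok_params = {"n_components": 2, "n_neighbors": 15, "metric": "euclidean"}

# Metric given as a function: not expressible in a YAML specification.
bad_params = {"n_components": 2, "n_neighbors": 15, "metric": my_distance}

print(find_callable_params(ok_params))
print(find_callable_params(bad_params))
```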

YAML specification:

definitions:
    ml_methods:
        my_umap:
            UMAP:
                n_components: 2
                n_neighbors: 15
                metric: euclidean
                components: [1, 2]
get_dimension_names() → List[str][source]

Module contents