Welcome to the immuneML documentation!

immuneML is a platform for machine learning-based analysis and classification of adaptive immune receptors and repertoires (AIRR). To get started using immuneML right away, check out our Quickstart tutorial.

immuneML can be used for:

Exploratory analysis of datasets, such as dataset overviews, statistical analyses, and visualizations that summarize the data;
Clustering analysis of datasets to examine whether the examples form clusters, how stable the clusters are, and how well the clusters correspond to any available external labels;
Training classification ML models for repertoire classification (e.g., disease prediction) or receptor sequence classification (e.g., antigen binding prediction), and applying them to new datasets with unknown class labels;
Simulating datasets for ML model benchmarking with known ground truth immune signals using LIgO;
Training generative ML models of receptor sequences and evaluating synthetic sequences across a range of characteristics.

The starting point for any immuneML analysis is the YAML specification file. This file defines the settings of the analysis components (also known as definitions), which are shown in six different colors in the figure below. Additionally, the YAML file describes one or more instructions, each of which corresponds to one of the applications listed above (some additional instructions are also available).

Figure: An overview of immuneML usage. Analysis components and instructions are specified in a YAML file. Each use case corresponds to a different instruction. The results of the instructions are summarized and presented in an HTML file.
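
To make the two-part layout of a specification more concrete, here is a minimal sketch in Python that builds a skeleton with a definitions section and an instructions section, and prints it as YAML. It is an illustration under assumptions, not a complete or validated specification: every name starting with my_ is a placeholder, the dataset path is made up, and the particular choices (AIRR format, SequenceLengthDistribution report, ExploratoryAnalysis instruction) are only one possible combination. The helpers to_yaml and check_spec are hypothetical and are not part of immuneML. The YAML specification pages remain the authoritative source for the available components and their parameters.

```python
# Hypothetical skeleton of an immuneML YAML specification, built as a plain dict.
# Names starting with "my_" and the dataset path are placeholders (assumptions).
spec = {
    "definitions": {
        "datasets": {
            "my_dataset": {
                "format": "AIRR",
                "params": {"path": "path/to/data/"},
            }
        },
        "reports": {"my_report": "SequenceLengthDistribution"},
    },
    "instructions": {
        "my_instruction": {
            "type": "ExploratoryAnalysis",
            "analyses": {
                "my_analysis": {"dataset": "my_dataset", "report": "my_report"}
            },
        }
    },
}


def to_yaml(node, indent=0):
    """Render a nested dict of strings as block-style YAML lines (no lists, no quoting)."""
    lines = []
    pad = " " * indent
    for key, value in node.items():
        if isinstance(value, dict):
            lines.append(f"{pad}{key}:")
            lines.extend(to_yaml(value, indent + 2))
        else:
            lines.append(f"{pad}{key}: {value}")
    return lines


def check_spec(candidate):
    """Return a list of structural problems; an empty list means the skeleton is consistent."""
    problems = []
    for section in ("definitions", "instructions"):
        if section not in candidate:
            problems.append(f"missing top-level section: {section}")
    for name, instruction in candidate.get("instructions", {}).items():
        if "type" not in instruction:
            problems.append(f"instruction {name} has no type")
    return problems


yaml_text = "\n".join(to_yaml(spec))
print(yaml_text)
print("problems:", check_spec(spec))
```

The printed text has the same shape as a specification file: components are declared once under definitions and then referenced by name from an instruction. The check only covers the top-level layout described above, so a skeleton that passes it may still be rejected by immuneML itself.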

Getting started

If you want to use immuneML locally, see Installing immuneML.

To become familiar with the YAML specification, you can find a concrete example in our Quickstart guide, or read about the overall YAML structure and options in How to specify an analysis with YAML.

Alternatively, to run immuneML in a web browser, go to our Galaxy Portal. It offers the same functionality as the command-line interface (using YAML specifications), plus simplified button-based interfaces for training classifiers. See the immuneML & Galaxy tutorials for more information.

immuneML can be applied to a wide variety of use cases. To help you get started, we offer Tutorials for some common applications (e.g., how to train models, or how to simulate synthetic data for benchmarking). Experienced users who want to customize their analysis can find the complete list of analysis components and their options under YAML specification.

Our open-source code can be found on GitHub :)

Previous versions

Documentation for previous immuneML versions can be found here: