immuneML Galaxy tools
If you are unfamiliar with Galaxy, we recommend first reading Introduction to Galaxy.
Overview of Galaxy tool functionalities
Each immuneML Galaxy tool provides an interface to run a specific immuneML instruction or workflow. For quick testing of immuneML functionalities, several tools provide button-based interfaces with a limited set of options. Alternatively, a YAML file may be used as input; this file is identical to the YAML file used with the command line interface.
| Galaxy tool | immuneML instruction | Interface type |
|---|---|---|
| | Creates dataset and runs optional reports with ExploratoryAnalysis instruction | Button or YAML-based |
| | Creates dataset with random dataset import | Button or YAML-based |
| | Modifies dataset with LigoSim instruction | Button or YAML-based |
| | YAML-based interface for training a classifier with TrainMLModel instruction | YAML-based |
| | Simplified interface for training a classifier with TrainMLModel instruction with sequence/receptor dataset | Button-based |
| | Simplified interface for training a classifier with TrainMLModel instruction with repertoire dataset | Button-based |
| | Applies an ML classifier with MLApplication instruction | Button-based |
| | Trains a generative model with TrainGenModel instruction | Button or YAML-based |
| | Creates dataset with ApplyGenModel by applying a trained generative model | Button-based |
| | Clusters a dataset with Clustering instruction | Button or YAML-based |
| | Runs any instruction (recommended for e.g., advanced ExploratoryAnalysis, or instructions not covered by other tools) | YAML-based |
immuneML datasets in Galaxy
In Galaxy, an immuneML dataset is a special type of history element, which internally contains an immuneML dataset stored in AIRR format. Datasets can be imported from files using the Create Dataset with Reports tool. Some other tools also produce (synthetic) immuneML datasets.
Tips for importing data:
If your dataset contains many files, consider using a Galaxy collection as input.
To test Galaxy quickly, you can generate a dataset of random sequences using the Simulate a Random Dummy Dataset tool.
See How to import data into immuneML for general information about datasets in immuneML.
When running a YAML-based tool, the tool will ask you to select a dataset from the Galaxy history, and the YAML should contain the following snippet to ensure the selected dataset is imported:

    definitions:
      datasets:
        dataset:
          format: AIRR
          params:
            path: dataset.yaml
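
Before uploading a YAML specification, it can be useful to check that it contains the dataset import block shown above. The following is a minimal sketch, assuming the YAML has already been parsed into a Python dictionary (for example with a YAML parser such as PyYAML). The function name `has_galaxy_dataset_import` is hypothetical and is not part of immuneML or Galaxy. Whether the dataset key must be literally `dataset` is not stated here, so the sketch accepts any key under `datasets`.

```python
def has_galaxy_dataset_import(spec):
    """Return True if any dataset definition matches the Galaxy import snippet.

    `spec` is assumed to be the parsed YAML specification (a nested dict).
    """
    definitions = spec.get("definitions") or {}
    datasets = definitions.get("datasets") or {}
    for entry in datasets.values():
        if not isinstance(entry, dict):
            continue
        params = entry.get("params")
        if not isinstance(params, dict):
            continue
        # The snippet above uses format AIRR and path dataset.yaml.
        if entry.get("format") == "AIRR" and params.get("path") == "dataset.yaml":
            return True
    return False


# Parsed form of the snippet above.
spec = {
    "definitions": {
        "datasets": {
            "dataset": {"format": "AIRR", "params": {"path": "dataset.yaml"}}
        }
    }
}
print(has_galaxy_dataset_import(spec))  # prints True
```

A specification that points `path` at a different file, or that has no `datasets` section, makes the function return `False`.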
Galaxy tool input and output
Galaxy tools produce their output as history elements which can be viewed, downloaded, or used as input for subsequent tools. immuneML tools will output the following history elements:
A summary HTML file showing the results (or the error message, if the run failed). For tools generating datasets, the dataset element also serves as the HTML summary.
An archive containing the zipped folder with all internally generated results (identical to the results you get when running immuneML on the command line).
Each button-based tool will also return the YAML file that was generated based on the user options to run immuneML.
Classifiers or generative models generated by the respective tools (these may be used as input for subsequent tools).
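
After downloading the results archive from the Galaxy history, it can be unpacked and inspected locally. The following is a minimal sketch using Python's standard `zipfile` module. The helper name `extract_and_list_html` and the file names in the usage comment are hypothetical; the actual layout of the archive depends on the tool and instruction that were run.

```python
import zipfile
from pathlib import Path


def extract_and_list_html(archive_path, target_dir):
    """Unpack a downloaded results archive and list the HTML files inside it.

    Returns the HTML file paths relative to `target_dir`, sorted by name.
    Only use this on archives you trust, since all members are extracted.
    """
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(target)
    return sorted(p.relative_to(target).as_posix() for p in target.rglob("*.html"))


# Example usage (hypothetical file names):
# for report in extract_and_list_html("immuneML_results.zip", "immuneML_results"):
#     print(report)
```

Listing the HTML files is a quick way to locate summary pages and reports among the extracted results.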