Quick start

This guide walks through the MassCube untargeted metabolomics workflow, which runs a complete untargeted analysis, from raw data to statistical results, with a single command.

If you haven’t installed MassCube yet, be sure to follow the installation guide before proceeding.

The MassCube untargeted metabolomics workflow

The workflow integrates metadata curation, feature detection, evaluation, alignment, annotation, and statistical analysis to provide users with a comprehensive view of the data (Fig. 1).

Fig. 1. The MassCube untargeted metabolomics workflow

Input (3+1)

You need three components for the project plus one MS/MS library for annotation.

In your project folder (e.g. my_project), you need to prepare the following components:

my_project
├── data
│   ├── sample1.mzML
│   ├── sample2.mzML
│   └── ...
├── sample_table.csv
└── parameters.csv

  1. data folder: a folder containing all raw LC-MS data files in .mzML or .mzXML format. This folder is mandatory. Instructions for file conversion are provided here.

  2. sample_table.csv file: a CSV file that declares the sample groups, including biological groups, quality control (QC) samples, and blank samples. A template can be downloaded from here. You can also use MassCube to generate a sample table and then edit it. If this file is not provided, normalization and statistical analysis are skipped. Note: in the sample table, mark blank samples in the “is_blank” column and QC samples in the “is_qc” column.

  3. parameters.csv file: a CSV file that sets the parameters for the workflow. You can set the parameters and download the resulting file from here, or download a template here. If this file is not provided, the default parameters are used and annotation is skipped, because no MS/MS library is specified.

  4. MS/MS library: to annotate MS/MS spectra, download an MS/MS library from here. The .pkl format is recommended because it loads faster.

Choose the right MS/MS database version
For MassCube version 1.2.0 or later, use New MS/MS Databases. For earlier versions, use Old MS/MS Databases.

Extra component for annotation:

  1. mzrt_list.csv file: a CSV file that lists m/z values and retention times for feature annotation. It is intended for annotating features by retention time (e.g. internal standards). A template can be downloaded from here. This file is optional.
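
Before running the workflow, it can help to confirm that the project folder matches the layout described above. The following is a minimal sketch, not part of MassCube: `check_project` is a hypothetical helper, and it assumes that the column names of sample_table.csv sit in the first row of the file.

```python
import csv
from pathlib import Path

RAW_EXTENSIONS = {".mzml", ".mzxml"}


def check_project(project_dir):
    """Return a list of messages about the project folder (hypothetical helper)."""
    root = Path(project_dir)
    messages = []

    # The data folder is mandatory and should hold at least one raw file.
    data_dir = root / "data"
    if not data_dir.is_dir():
        messages.append("missing mandatory 'data' folder")
    elif not any(p.suffix.lower() in RAW_EXTENSIONS for p in data_dir.iterdir()):
        messages.append("no .mzML or .mzXML files in 'data'")

    # sample_table.csv is optional, but should carry the is_blank and is_qc columns.
    # Assumption: the column names are in the first row of the file.
    sample_table = root / "sample_table.csv"
    if sample_table.is_file():
        with sample_table.open(newline="", encoding="utf-8-sig") as handle:
            header = next(csv.reader(handle), [])
        for column in ("is_blank", "is_qc"):
            if column not in header:
                messages.append(f"sample_table.csv lacks the '{column}' column")
    else:
        messages.append("no sample_table.csv: normalization and statistics are skipped")

    # parameters.csv is optional; without it the defaults apply and annotation is skipped.
    if not (root / "parameters.csv").is_file():
        messages.append("no parameters.csv: defaults are used, annotation is skipped")

    return messages
```

Calling `check_project("my_project")` returns an empty list when the mandatory and optional components are all in place. The sketch does not check the MS/MS library, since its location is set inside parameters.csv.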

Processing

In the project folder, open a terminal and run the following command:

untargeted-metabolomics
How to open the terminal
Make sure the working directory of the terminal is the project folder. See the instructions for Windows users and macOS users.
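
If you prefer to launch the workflow from a script rather than from a terminal, the same command can be started with the project folder as the working directory. This is a minimal sketch under the assumption that the `untargeted-metabolomics` command is on your PATH after installation; `run_workflow` is a hypothetical wrapper, not a MassCube function.

```python
import shutil
import subprocess
from pathlib import Path


def run_workflow(project_dir, command="untargeted-metabolomics"):
    """Run the workflow command inside the project folder (hypothetical wrapper)."""
    root = Path(project_dir)
    if not root.is_dir():
        raise NotADirectoryError(f"project folder not found: {root}")

    # shutil.which returns None when the command is not on PATH,
    # which usually means MassCube is not installed in this environment.
    executable = shutil.which(command)
    if executable is None:
        raise FileNotFoundError(f"'{command}' not found on PATH")

    # cwd= plays the role of opening the terminal in the project folder.
    # check=True raises CalledProcessError if the command exits with an error.
    return subprocess.run([executable], cwd=root, check=True)
```

For example, `run_workflow("my_project")` is equivalent to opening a terminal in my_project and typing the command by hand.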

Output

After processing finishes, the project folder contains the following files and folders:

my_project
├── data
├── sample_table.csv
├── parameters.csv
├── mzrt_list.csv (optional)
├── project_files
│   ├── data_processing_metadata_[DATE].pkl
│   ├── features.msp
│   └── ...
├── aligned_feature_table.txt
├── normalized_feature_table.txt (if signal normalization applied)
├── single_files
│   ├── sample1.txt
│   ├── sample2.txt
│   └── ...
├── chromatograms
│   ├── sample1.png
│   ├── sample2.png
│   └── ...
├── ms2_matching
│   ├── compound1.png
│   ├── compound2.png
│   └── ...
├── statistical_analysis
├── normalization results
│   ├── feature_0_normalization.png
│   ├── feature_1_normalization.png
│   └── ...
└── ...

  1. project_files folder: a folder containing the metadata file for data processing.
  2. aligned_feature_table.txt file: feature table after alignment (if applied).
  3. single_files folder: a folder containing the feature table for each sample.
  4. chromatograms folder: a folder containing the chromatogram for each sample.
  5. ms2_matching folder: a folder containing the MS/MS matching for each annotated compound.
  6. statistical_analysis folder: a folder containing the statistical analysis results.
  7. normalization results folder: a folder containing the normalization results (if applied).
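
To see at a glance which of these outputs a run produced, you can check the project folder against the names listed above. This is a minimal sketch: `summarize_outputs` is a hypothetical helper, and the names are copied from the tree above, so they may differ if your MassCube version names its outputs differently.

```python
from pathlib import Path

# Output names as listed in this guide (assumed to match your MassCube version).
EXPECTED_OUTPUTS = [
    "project_files",
    "aligned_feature_table.txt",
    "normalized_feature_table.txt",
    "single_files",
    "chromatograms",
    "ms2_matching",
    "statistical_analysis",
    "normalization results",
]


def summarize_outputs(project_dir):
    """Map each expected output name to whether it exists in the project folder."""
    root = Path(project_dir)
    return {name: (root / name).exists() for name in EXPECTED_OUTPUTS}
```

A missing entry is not necessarily an error. For example, the normalized feature table and the normalization results are only written when normalization is applied.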