Data preparation

File conversion

Raw LC-MS data must be converted to mzML or mzXML format before they can be processed in MassCube.

Currently, only centroid data are supported in MassCube.

We recommend using MSConvert for file conversion.

Download and install MSConvert

Visit the official website to download ProteoWizard.

File conversion using MSConvert

Fig. 1. MSConvert GUI

Step 1. Set options

Check the boxes as shown in Fig. 1.

⚠️ Do NOT check "Use Zlib compression".

Step 2. Set the Peak Picking filter

Step 3. Add the Peak Picking filter

Step 4. Browse and load files

Step 5. Start conversion

By default, the converted files will be saved in the same directory as the raw files.
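Because MassCube currently supports only centroid data, it can be worth confirming that a converted file was actually centroided. The following is a minimal sketch, not part of MassCube: it assumes an uncompressed mzML file and looks for the PSI-MS controlled-vocabulary accessions MS:1000127 (centroid spectrum) and MS:1000128 (profile spectrum). The function name `spectrum_type` is hypothetical.

```python
# Accessions from the PSI-MS controlled vocabulary used in mzML files.
CENTROID = 'accession="MS:1000127"'  # centroid spectrum
PROFILE = 'accession="MS:1000128"'   # profile spectrum


def spectrum_type(mzml_path):
    """Return 'centroid', 'profile', 'mixed', or 'unknown' for an mzML file.

    Hypothetical helper: it scans the file as text, so it only works on
    plain (not gzip-compressed) mzML files.
    """
    has_centroid = False
    has_profile = False
    with open(mzml_path, encoding="utf-8", errors="replace") as handle:
        for line in handle:
            has_centroid = has_centroid or CENTROID in line
            has_profile = has_profile or PROFILE in line
            if has_centroid and has_profile:
                return "mixed"
    if has_centroid:
        return "centroid"
    if has_profile:
        return "profile"
    return "unknown"
```

A result other than "centroid" suggests that the Peak Picking filter was not applied during conversion.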

Convert files in command line mode
You can also convert files using MSConvert in command line mode. For more information, please refer to the documentation.
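As a rough illustration, the GUI settings above can be mirrored in a scripted call. This sketch assumes that the `msconvert` executable from ProteoWizard is on the system PATH and uses the commonly documented options `--mzML`, `--filter "peakPicking vendor msLevel=1-"`, and `-o`; please verify them against the MSConvert documentation for your installed version. The helper names `build_msconvert_command` and `convert` are hypothetical.

```python
import subprocess


def build_msconvert_command(raw_files, output_dir, executable="msconvert"):
    """Build an msconvert argument list for centroided mzML output.

    Assumption: `executable` resolves to ProteoWizard's msconvert.
    The zlib option is deliberately left out, matching the warning above.
    """
    command = [executable]
    command += [str(path) for path in raw_files]
    command += [
        "--mzML",                                    # write mzML output
        "--filter", "peakPicking vendor msLevel=1-",  # centroid all MS levels
        "-o", str(output_dir),                       # output directory
    ]
    return command


def convert(raw_files, output_dir):
    """Run the conversion; raises CalledProcessError if msconvert fails."""
    subprocess.run(build_msconvert_command(raw_files, output_dir), check=True)
```

For example, `convert(["sample1.raw", "sample2.raw"], "data")` would write the converted files into a `data` folder. The file names here are placeholders.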

Parameter file

A parameter file (.csv) is used to set parameters for the workflow. A template is provided here. If no parameter file is provided, the default parameters will be applied. For MassCube version 1.0 or earlier, please use this template.

Sample table

A sample table (.csv) is used to declare the names of the samples and the group each one belongs to: a biological group, quality control (QC) samples, or blank samples. A template is provided here. For MassCube version 1.0 or earlier, please use this template.

For large-scale metabolomics data, it’s not easy to prepare the sample table manually. In MassCube, we provide a function to automatically generate the sample table based on the file names in the data folder, and users can further define the groups.

Automatically generate sample table

In the project folder, open a terminal and run the following command:

generate-sample-table

Your project folder should include a subfolder named data that contains the raw LC-MS data files in mzML or mzXML format.

my_project
├── data
│   ├── sample1.mzML
│   ├── sample2.mzML
│   └── ...
└── ...
You need to further edit the generated sample table to specify QCs, blanks and sample groups.
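For large projects, a first guess at the group labels can be derived from the file names before editing the table by hand. The sketch below is illustrative only and does not reproduce what `generate-sample-table` does. It assumes a hypothetical naming convention in which QC files contain "qc" and blank files contain "blank". The labels "qc", "blank", and "sample" and the function names are also assumptions, so the values in the real sample table should follow the template.

```python
from pathlib import Path

MS_SUFFIXES = {".mzml", ".mzxml"}  # formats expected in the data subfolder


def guess_group(sample_name):
    """Guess a group label from a sample name (hypothetical convention)."""
    lowered = sample_name.lower()
    if "blank" in lowered:
        return "blank"
    if "qc" in lowered:
        return "qc"
    return "sample"


def propose_groups(project_dir):
    """Return (sample name, guessed group) pairs for files in <project>/data."""
    data_dir = Path(project_dir) / "data"
    rows = []
    # Sort case-insensitively so the order is the same on every platform.
    for path in sorted(data_dir.iterdir(), key=lambda p: p.name.lower()):
        if path.suffix.lower() in MS_SUFFIXES:
            rows.append((path.stem, guess_group(path.stem)))
    return rows
```

The returned pairs are only a starting point. Biological groups still need to be assigned manually in the sample table.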