About the output

Explanation of output files

After processing, MassCube generates output files in the project folder as described below.

my_project/
├── data
├── sample_table.csv
├── parameters.csv
├── mzrt_list.csv
├── aligned_feature_table.txt
├── normalized_feature_table.txt
├── project_files
│   ├── features.msp
│   ├── aligned_feature_table_before_annotation.txt

│   └── ...
├── single_files
│   ├── sample1.txt
│   ├── sample2.txt
│   └── ...
├── chromatograms
│   ├── sample1.png
│   ├── sample2.png
│   └── ...
├── ms2_matching
│   ├── compound1.png
│   ├── compound2.png
│   └── ...
├── statistical_analysis
├── normalization_results
│   ├── feature_1_normalization.png
│   ├── feature_2_normalization.png
│   └── ...
├── ...

aligned_feature_table.txt

Samples are in columns, and features are in rows. From left to right, the columns are:

group_ID: the ID of the feature group. Features with the same group_ID are considered to originate from the same neutral compound (i.e., isotopes, adducts, and in-source fragments are grouped together).
feature_ID: the ID of features ranked by maximum intensity from high to low.
m/z: mass-to-charge ratio.
RT: retention time in minutes.
adduct: the adduct type.
is_isotope: whether the feature is an isotope (TRUE or FALSE).
is_in_source_fragment: whether the feature is an in-source fragment (TRUE or FALSE).
Gaussian_similarity: the similarity score (0–1) that measures how well the ion trace fits a Gaussian curve.
noise_score: the score (0–∞) that measures the level of noise. A noise score > 0.5 indicates considerable noise.
asymmetry_factor: the asymmetry factor of the peak.
peak_shape: the retention time–intensity pairs of the peak shape: RT1;intensity1 | RT2;intensity2 | …
detection_rate: the fraction of non-blank files in which the feature was detected.
detection_rate_gap_filled: after gap filling, the fraction of non-blank files in which the feature was detected.
alignment_reference_file: the reference file for alignment. It also provides the peak shape information.
charge: the number of charges.
isotopes: isotope distribution acquired from the alignment reference file: mz1;intensity1 | mz2;intensity2 | …
MS2_reference_file: the source file of the MS/MS spectrum.
MS2: the experimental MS/MS spectrum.
matched_MS2: the database MS/MS spectrum.
search_mode: identity_search, fuzzy_search, or mzRT_match.
annotation: the compound name.
formula: the molecular formula.
similarity: the similarity score (0–1) between the experimental and database MS/MS spectra.
matched_peak_number: the number of matched peaks.
SMILES: the SMILES representation of the compound.
InChIKey: the InChIKey representation of the compound.

If normalization is performed, the corresponding “normalized_feature_table.txt” file and “normalization_results” folder will be provided.

project_files

A folder that contains the project files including

features.msp: a MSP file containing all detected features for further analyses.
aligned_feature_table_before_annotation.txt: the unannotated feature table.
aligned_features.pkl: a pickle file containing the aligned features as a list.

single_files

A folder that contains txt files for each raw MS data after feature detection and segmentation.

chromatograms

A folder that contains base peak chromatograms for each raw MS data.

ms2_matching

A folder that contains mirror plots of MS/MS spectral matching results. Only identity search annotations will be plotted.

The workflow