feature_grouping
Overview
The feature_grouping module provides functions that group features by m/z value, retention time, MS2 data, and scan-to-scan correlation. It is designed for untargeted metabolomics workflows, where several features may arise from the same compound as isotopes, in-source fragments, or adducts. Grouping can be performed either on aligned features, based on the reference file, or within a single file.
Functions
group_features_after_alignment
group_features_after_alignment(features, params: Params)
Groups features after alignment based on the reference file. This function requires reloading the raw data to examine the scan-to-scan correlation between features. The annotated feature groups are stored in the feature_group_id attribute of the AlignedFeature objects.
Parameters:
- features(list): A list of AlignedFeature objects.
- params(Params object): A Params object that contains the parameters for feature grouping.
group_features_single_file
group_features_single_file(d)
Groups features from a single file based on the m/z, retention time, MS2, and scan-to-scan correlation. The annotated feature groups are stored in the feature_group_id attribute of the Feature objects.
Parameters:
- d(MSData object): An MSData object containing the detected ROIs to be grouped.
generate_search_dict
generate_search_dict(feature, adduct_form, ion_mode)
Generates a search dictionary for feature grouping based on the adduct form and ionization mode.
Parameters:
- feature(Feature object): The feature object to be grouped.
- adduct_form(str): The adduct form of the feature.
- ion_mode(str): The ionization mode, either “positive” or “negative”.
Returns:
- dict: A dictionary containing the possible adducts and in-source fragments.
find_isotope_signals
find_isotope_signals(mz, signals, mz_tol=0.015, charge_state=1, num=5)
Finds isotope patterns from the MS1 signals based on the m/z value and intensity.
Parameters:
- mz(float): The m/z value of the feature.
- signals(np.array): The MS1 signals as [[m/z, intensity], …].
- mz_tol(float): The m/z tolerance to find isotopes (default 0.015 Da).
- charge_state(int): The charge state of the feature (default 1).
- num(int): The maximum number of isotopes to be found (default 5).
Returns:
- numpy.array: The m/z and intensity of the isotopes.
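The isotope search described above can be sketched as follows. This is a minimal illustration, not the module's implementation: it assumes isotope peaks are spaced by the 13C/12C mass difference divided by the charge state, and it keeps the most intense signal inside the tolerance window at each step. The name find_isotope_peaks and the constant C13_C12_DELTA are hypothetical.

```python
# Hypothetical sketch; the real find_isotope_signals may apply further checks
# (for example on intensity ratios) that are not documented here.
C13_C12_DELTA = 1.003355  # mass difference between 13C and 12C, in Da


def find_isotope_peaks(mz, signals, mz_tol=0.015, charge_state=1, num=5):
    """Collect up to `num` consecutive isotope peaks above `mz`.

    signals: iterable of (m/z, intensity) pairs from one MS1 scan.
    Returns a list of (m/z, intensity) tuples, lightest isotope first.
    """
    spacing = C13_C12_DELTA / charge_state
    found = []
    for k in range(1, num + 1):
        target = mz + k * spacing
        # All signals within the tolerance window around the expected m/z
        candidates = [(m, i) for m, i in signals if abs(m - target) <= mz_tol]
        if not candidates:
            break  # the isotope series must be consecutive
        # Keep the most intense candidate (an assumption of this sketch)
        found.append(max(candidates, key=lambda p: p[1]))
    return found


signals = [(200.0, 1000.0), (201.0034, 110.0), (202.0067, 8.0), (250.0, 500.0)]
isotopes = find_isotope_peaks(200.0, signals)
```

With these example signals, the search stops after the second isotope because no signal lies near the expected third position.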
scan_to_scan_cor_intensity
scan_to_scan_cor_intensity(a, b)
Calculates the scan-to-scan correlation between two intensity arrays, one per m/z, as the Pearson correlation of their intensity profiles.
Parameters:
- a(np.array): Intensity array of the first m/z.
- b(np.array): Intensity array of the second m/z.
Returns:
- float: The scan-to-scan correlation between the two features.
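For reference, the Pearson correlation of two intensity profiles can be computed as in the sketch below. It uses only the standard library so that it runs without dependencies, and it accepts any sequence of numbers, including NumPy arrays. The function name and the choice to return 0.0 for mismatched lengths or constant profiles are assumptions of this sketch, not documented behavior of scan_to_scan_cor_intensity.

```python
import math


def pearson_intensity_correlation(a, b):
    """Pearson correlation of two equal-length intensity profiles."""
    a = [float(x) for x in a]
    b = [float(x) for x in b]
    # Assumption: undefined cases are reported as 0.0 (no correlation)
    if len(a) != len(b) or len(a) < 2:
        return 0.0
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    denom = math.sqrt(var_a * var_b)
    if denom == 0.0:
        return 0.0  # at least one profile is constant
    return cov / denom


# Two co-eluting signals whose intensities rise and fall together
r = pearson_intensity_correlation([10, 40, 90, 40, 10], [1, 4, 9, 4, 1])
```

Signals from the same compound are expected to rise and fall together across scans, so their correlation is close to 1.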
scan_to_scan_correlation
scan_to_scan_correlation(feature_a, feature_b)
Calculates the scan-to-scan correlation between two Feature objects as the Pearson correlation of their intensity profiles.
Parameters:
- feature_a(Feature object): The first feature object.
- feature_b(Feature object): The second feature object.
Returns:
- float: The scan-to-scan correlation between the two features.
get_charge_state
get_charge_state(mz_seq, valid_charge_states=[1,2])
Determines the charge state of the isotopes based on the m/z sequence.
Parameters:
- mz_seq(list): A list of m/z values of isotopes.
- valid_charge_states(list): A list of valid charge states (default [1,2]).
Returns:
- int: The charge state of the isotopes.
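One way to infer a charge state from an isotope m/z sequence is to compare the average spacing between neighboring peaks with the spacing expected for each candidate charge. The sketch below illustrates that idea only; the function name, the 13C/12C spacing constant, and the fallback to charge 1 for sequences with fewer than two peaks are assumptions, not the documented behavior of get_charge_state.

```python
C13_C12_DELTA = 1.003355  # mass difference between 13C and 12C, in Da


def infer_charge_state(mz_seq, valid_charge_states=(1, 2)):
    """Pick the charge whose expected isotope spacing best matches mz_seq."""
    if len(mz_seq) < 2:
        return 1  # assumption: not enough peaks to decide, default to 1
    # Differences between neighboring isotope peaks
    gaps = [hi - lo for lo, hi in zip(mz_seq, mz_seq[1:])]
    mean_gap = sum(gaps) / len(gaps)
    # For charge z, isotope peaks are expected about 1.003355 / z apart
    return min(valid_charge_states,
               key=lambda z: abs(mean_gap - C13_C12_DELTA / z))


singly = infer_charge_state([200.0, 201.0034, 202.0067])
doubly = infer_charge_state([200.0, 200.5017, 201.0034])
```

A tuple is used for the default here to avoid sharing a mutable default argument between calls.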
Constants
ADDUCT_POS
A dictionary of positive adducts, keyed by adduct form. Each value holds the m/z shift, charge state, and multiplier.
ADDUCT_NEG
A dictionary of negative adducts, keyed by adduct form. Each value holds the m/z shift, charge state, and multiplier.
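To illustrate how a shift, charge state, and multiplier could be combined, the sketch below builds a small example table and computes the expected m/z of an adduct from a neutral monoisotopic mass. The table EXAMPLE_ADDUCTS_POS, the tuple layout, and the formula are assumptions made for illustration; the actual contents and layout of ADDUCT_POS and ADDUCT_NEG may differ.

```python
# Hypothetical table: adduct form -> (m/z shift in Da, charge state, multiplier).
# Only singly charged adducts are listed, so the result does not depend on
# whether a shift is stored per ion or per charge.
EXAMPLE_ADDUCTS_POS = {
    "[M+H]+": (1.007276, 1, 1),    # proton mass
    "[M+Na]+": (22.989221, 1, 1),  # sodium atom minus one electron
    "[2M+H]+": (1.007276, 1, 2),   # protonated dimer
}


def adduct_mz(neutral_mass, adduct_form, table=EXAMPLE_ADDUCTS_POS):
    """Expected m/z, assuming m/z = (multiplier * M + shift) / charge."""
    shift, charge, multiplier = table[adduct_form]
    return (multiplier * neutral_mass + shift) / charge


glucose = 180.063388  # monoisotopic mass of C6H12O6, in Da
expected = {form: adduct_mz(glucose, form) for form in EXAMPLE_ADDUCTS_POS}
```

Features whose m/z values match several of these expected values at the same retention time are candidates for the same feature group.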