feature_detection
Overview
Untargeted feature detection.
Classes
Feature
A class to store a feature characterized by a unique pair of m/z and retention time.
Attributes:
rt_seq(list of float): Retention time sequence.signals(list): Signal sequence organized as [[m/z, intensity], …].scan_idx_seq(list of int): Scan index sequence.ms2_seq(list): MS2 spectra.gap_counter(int): Counter for the number of consecutive zeros in the peak tail.id(int): Feature ID.feature_group_id(int): Peak group ID.mz(float): m/z value.rt(float): Retention time.scan_idx(int): Scan index of the peak apex.peak_height(float): Peak height.peak_area(float): Peak area.top_average(float): Average of the highest three intensities.ms2(object): Representative MS2 spectrum.length(int): Number of valid scans in the feature.gaussian_similarity(float): Gaussian similarity.noise_score(float): Noise score.asymmetry_factor(float): Asymmetry factor.sse(float): Squared error to the smoothed curve.is_segmented(bool): Indicates if the feature is segmented.is_isotope(bool): Indicates if the feature is an isotope.charge_state(int): Charge state of the feature.isotope_signals(list): Isotope signals [[m/z, intensity], …].is_in_source_fragment(bool): Indicates if the feature is an in-source fragment.adduct_type(str): Adduct type.annotation_algorithm(str): Annotation algorithm.search_mode(str): Search mode (‘identity search’, ‘fuzzy search’, or ‘mzrt_search’).similarity(float): Similarity score (0-1).annotation(str): Name of annotated compound.formula(str): Molecular formula.matched_peak_number(int): Number of matched peaks.smiles(str): SMILES notation.inchikey(str): InChIKey notation.matched_precursor_mz(float): Matched precursor m/z.matched_ms2(object): Matched MS2 spectra.matched_adduct_type(str): Matched adduct type.
Methods:
extend(self, rt, signal, scan_idx): Extends the chromatographic peak with new data points.get_mz_error(self): Calculates the 3*sigma error of the feature’s m/z.get_rt_error(self): Calculates the 3*sigma error of the feature’s retention time.summarize(self, ph=True, pa=True, ta=True, g_score=True, n_score=True, a_score=True): Summarizes the feature by calculating summary statistics.subset(self, start, end, summarize=True): Keeps a subset of the feature based on start and end positions.
Functions
detect_features
detect_features(d)
Detects features in the MS data.
Parameters:
d(MSData object): An object that contains the MS data.
Returns:
final_features(list ofFeatureobjects): A list of detected features.
segment_feature
segment_feature(feature, sigma=1.2, prominence_ratio=0.05, distance=10, peak_height_tol=1000, length_tol=5, sse_tol=0.5)
Segments a feature into multiple features based on edge detection.
Parameters:
feature(Featureobject): The feature to segment.sigma(float): The sigma value for the Gaussian filter. DFault is 1.2.prominence_ratio(float): The prominence ratio for finding peaks. Default is 0.05.distance(int): The minimum distance between peaks. Default is 10.peak_height_tol(float): The peak height tolerance for segmentation.length_tol(int): The length tolerance for segmentation. Default is 5.sse_tol(float): The squared error tolerance for segmentation. Default is 0.5.
Returns:
segmented_features(list ofFeatureobjects): A list of segmented features.
Please see the source code for more internal functions.