feature_detection
Overview
Untargeted feature detection.
Classes
Feature
A class to store a feature characterized by a unique pair of m/z and retention time.
Attributes:
rt_seq
(list of float): Retention time sequence.signals
(list): Signal sequence organized as [[m/z, intensity], …].scan_idx_seq
(list of int): Scan index sequence.ms2_seq
(list): MS2 spectra.gap_counter
(int): Counter for the number of consecutive zeros in the peak tail.id
(int): Feature ID.feature_group_id
(int): Peak group ID.mz
(float): m/z value.rt
(float): Retention time.scan_idx
(int): Scan index of the peak apex.peak_height
(float): Peak height.peak_area
(float): Peak area.top_average
(float): Average of the highest three intensities.ms2
(object): Representative MS2 spectrum.length
(int): Number of valid scans in the feature.gaussian_similarity
(float): Gaussian similarity.noise_score
(float): Noise score.asymmetry_factor
(float): Asymmetry factor.sse
(float): Squared error to the smoothed curve.is_segmented
(bool): Indicates if the feature is segmented.is_isotope
(bool): Indicates if the feature is an isotope.charge_state
(int): Charge state of the feature.isotope_signals
(list): Isotope signals [[m/z, intensity], …].is_in_source_fragment
(bool): Indicates if the feature is an in-source fragment.adduct_type
(str): Adduct type.annotation_algorithm
(str): Annotation algorithm.search_mode
(str): Search mode (‘identity search’, ‘fuzzy search’, or ‘mzrt_search’).similarity
(float): Similarity score (0-1).annotation
(str): Name of annotated compound.formula
(str): Molecular formula.matched_peak_number
(int): Number of matched peaks.smiles
(str): SMILES notation.inchikey
(str): InChIKey notation.matched_precursor_mz
(float): Matched precursor m/z.matched_ms2
(object): Matched MS2 spectra.matched_adduct_type
(str): Matched adduct type.
Methods:
extend(self, rt, signal, scan_idx)
: Extends the chromatographic peak with new data points.get_mz_error(self)
: Calculates the 3*sigma error of the feature’s m/z.get_rt_error(self)
: Calculates the 3*sigma error of the feature’s retention time.summarize(self, ph=True, pa=True, ta=True, g_score=True, n_score=True, a_score=True)
: Summarizes the feature by calculating summary statistics.subset(self, start, end, summarize=True)
: Keeps a subset of the feature based on start and end positions.
Functions
detect_features
detect_features(d)
Detects features in the MS data.
Parameters:
d
(MSData object): An object that contains the MS data.
Returns:
final_features
(list ofFeature
objects): A list of detected features.
segment_feature
segment_feature(feature, sigma=1.2, prominence_ratio=0.05, distance=10, peak_height_tol=1000, length_tol=5, sse_tol=0.5)
Segments a feature into multiple features based on edge detection.
Parameters:
feature
(Feature
object): The feature to segment.sigma
(float): The sigma value for the Gaussian filter. DFault is 1.2.prominence_ratio
(float): The prominence ratio for finding peaks. Default is 0.05.distance
(int): The minimum distance between peaks. Default is 10.peak_height_tol
(float): The peak height tolerance for segmentation.length_tol
(int): The length tolerance for segmentation. Default is 5.sse_tol
(float): The squared error tolerance for segmentation. Default is 0.5.
Returns:
segmented_features
(list ofFeature
objects): A list of segmented features.
Please see the source code for more internal functions.