peptdeep.mass_spec.match
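Classes:
PepSpecMatch([charged_frag_types]) – Main entry for peptide-spectrum matching.
Functions:
match_centroid_mz(spec_mzs, query_mzs, spec_mz_tols) – Matching query masses against sorted MS2/spec centroid masses, only closest (minimal abs mass error) peaks are returned.
match_first_last_profile_mz(spec_mzs, query_mzs, spec_mz_tols) – Matching query masses against sorted MS2/spec profile masses, both first and last m/z values are returned.
match_one_raw_with_numba(spec_idxes, ...) – Internal function to match fragment mz values to spectrum mz values.
match_profile_mz(spec_mzs, query_mzs, spec_mz_tols, spec_intens) – Matching query masses against sorted MS2/spec profile masses, only highest peaks are returned.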
- class peptdeep.mass_spec.match.PepSpecMatch(charged_frag_types=['b_z1', 'b_z2', 'y_z1', 'y_z2', 'b_modloss_z1', 'b_modloss_z2', 'y_modloss_z1', 'y_modloss_z2'])
Bases: object
Main entry for peptide-spectrum matching.
TODO: figure out relation with peptdeep.match.psm_match.PepSpecMatch.
Methods:
__init__([charged_frag_types])
get_fragment_mz_df(psm_df)
match_ms2_centroid(psm_df, ms2_file_dict[, ...]) – Matching the PSM dataframe against the MS2 files in ms2_file_dict; matched values are stored as attributes (self.psm_df, self.fragment_mz_df, self.matched_intensity_df, self.matched_mz_err_df).
match_ms2_one_raw(psm_df_one_raw, ms2_file) – Matching psm_df_one_raw against ms2_file.
- __init__(charged_frag_types=['b_z1', 'b_z2', 'y_z1', 'y_z2', 'b_modloss_z1', 'b_modloss_z2', 'y_modloss_z1', 'y_modloss_z2'])
- match_ms2_centroid(psm_df: DataFrame, ms2_file_dict: dict, ms2_file_type: str = 'alphapept', ppm=True, tol=20.0)
Matching the PSM dataframe against the MS2 files in ms2_file_dict. This method stores the matched values as attributes:
- self.psm_df
- self.fragment_mz_df
- self.matched_intensity_df
- self.matched_mz_err_df
- Parameters:
psm_df (pd.DataFrame) – PSM dataframe
ms2_file_dict (dict) – {raw_name: ms2 path}
ms2_file_type (str, optional) – Could be ‘alphapept’, ‘mgf’ or ‘thermo’. Defaults to ‘alphapept’.
ppm (bool, optional) – Whether to use a ppm tolerance. Defaults to True.
tol (float, optional) – Tolerance value, in ppm when ppm is True. Defaults to 20.0.
- match_ms2_one_raw(psm_df_one_raw: DataFrame, ms2_file: str, ms2_file_type: str = 'alphapept', ppm: bool = True, tol: float = 20.0) → tuple
Matching psm_df_one_raw against ms2_file
- Parameters:
psm_df_one_raw (pd.DataFrame) – PSM dataframe that contains PSMs from only one raw file.
ms2_file (str) – MS2 file path.
ms2_file_type (str, optional) – MS2 file type, one of ‘thermo’, ‘alphapept’ or ‘mgf’. Defaults to ‘alphapept’.
ppm (bool, optional) – Whether to use a ppm tolerance. Defaults to True.
tol (float, optional) – Tolerance value. Defaults to 20.0.
- Returns:
pd.DataFrame: psm dataframe with fragment index information.
pd.DataFrame: fragment mz dataframe.
pd.DataFrame: matched intensity dataframe.
pd.DataFrame: matched mass error dataframe. np.inf if a fragment is not matched.
- Return type:
tuple
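The ppm and tol parameters above describe a relative tolerance, while the lower-level matching functions on this page take a Da tolerance array (spec_mz_tols). A minimal sketch of the usual ppm-to-Da conversion follows. The helper name `ppm_to_da` is hypothetical and is not part of peptdeep; how the library performs this conversion internally is not stated here.

```python
import numpy as np

def ppm_to_da(mzs: np.ndarray, tol: float, ppm: bool = True) -> np.ndarray:
    """Hypothetical helper: return one absolute (Da) tolerance per m/z value."""
    mzs = np.asarray(mzs, dtype=np.float64)
    if ppm:
        # 1 ppm of an m/z value is that value times 1e-6
        return mzs * tol * 1e-6
    # Otherwise treat tol as an absolute tolerance shared by all peaks
    return np.full_like(mzs, tol)

spec_mzs = np.array([500.0, 1000.0, 2000.0])
spec_mz_tols = ppm_to_da(spec_mzs, tol=20.0, ppm=True)
```

With a 20 ppm tolerance the absolute window grows linearly with m/z, so peaks at higher m/z are matched with a wider Da window.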
- peptdeep.mass_spec.match.match_centroid_mz(spec_mzs: ndarray, query_mzs: ndarray, spec_mz_tols: ndarray) → ndarray
Matching query masses against sorted MS2/spec centroid masses, only closest (minimal abs mass error) peaks are returned.
- Parameters:
spec_mzs (np.ndarray) – MS2 or spec mz values, 1-D float array
query_mzs (np.ndarray) – query mz values, n-D float array
spec_mz_tols (np.ndarray) – Da tolerance array, same shape as spec_mzs
- Returns:
np.ndarray of int32, the shape is the same as query_mzs. -1 means no peaks are matched for the query mz
- Return type:
np.ndarray
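The closest-peak behaviour described for match_centroid_mz can be illustrated with plain NumPy. This is a sketch under stated assumptions, not the library's implementation: the function name `closest_peak_idxes` is hypothetical, and only the two spectrum peaks neighbouring each query are considered as candidates.

```python
import numpy as np

def closest_peak_idxes(spec_mzs, query_mzs, spec_mz_tols):
    """Hypothetical re-implementation: index of the closest peak within tolerance, else -1."""
    spec_mzs = np.asarray(spec_mzs, dtype=np.float64)
    query_mzs = np.asarray(query_mzs, dtype=np.float64)
    spec_mz_tols = np.asarray(spec_mz_tols, dtype=np.float64)
    if spec_mzs.size == 0:
        return np.full(query_mzs.shape, -1, dtype=np.int32)

    # Index of the first peak >= query, and the peak just before it
    right = np.searchsorted(spec_mzs, query_mzs)
    left = np.clip(right - 1, 0, len(spec_mzs) - 1)
    right = np.clip(right, 0, len(spec_mzs) - 1)

    # Absolute errors; candidates outside their own tolerance are set to inf
    left_err = np.abs(spec_mzs[left] - query_mzs)
    right_err = np.abs(spec_mzs[right] - query_mzs)
    left_err = np.where(left_err <= spec_mz_tols[left], left_err, np.inf)
    right_err = np.where(right_err <= spec_mz_tols[right], right_err, np.inf)

    best = np.where(left_err <= right_err, left, right)
    matched = np.isfinite(np.minimum(left_err, right_err))
    return np.where(matched, best, -1).astype(np.int32)

spec_mzs = np.array([100.0, 200.0, 300.0])
spec_mz_tols = np.full(3, 0.02)
query_mzs = np.array([[100.005, 150.0], [299.99, 200.5]])
idxes = closest_peak_idxes(spec_mzs, query_mzs, spec_mz_tols)
```

The result has the same shape as query_mzs, so it can be used directly to index into per-fragment arrays after masking out the -1 entries.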
- peptdeep.mass_spec.match.match_first_last_profile_mz(spec_mzs: ndarray, query_mzs: ndarray, spec_mz_tols: ndarray) → ndarray
Matching query masses against sorted MS2/spec profile masses, both first and last m/z values are returned.
- Parameters:
spec_mzs (np.ndarray) – MS2 or spec mz values, 1-D float array
query_mzs (np.ndarray) – query mz values, n-D float array
spec_mz_tols (np.ndarray) – Da tolerance array, same shape as spec_mzs
- Returns:
2-D np.ndarray of int32 holding the first and last matched index for each query mz. The shape is (len(query_mzs), 2). -1 means no peaks are matched for the query mz
- Return type:
np.ndarray
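For profile data, several consecutive spectrum points can fall inside the tolerance window of one query. A minimal NumPy sketch of finding the first and last matching index is shown below. The name `first_last_peak_idxes` is hypothetical, and the sketch assumes that both `spec_mzs - spec_mz_tols` and `spec_mzs + spec_mz_tols` are sorted, which holds for constant or ppm-based tolerances; it is not the library's code.

```python
import numpy as np

def first_last_peak_idxes(spec_mzs, query_mzs, spec_mz_tols):
    """Hypothetical re-implementation: (first, last) matched index per query, -1 if none."""
    spec_mzs = np.asarray(spec_mzs, dtype=np.float64)
    spec_mz_tols = np.asarray(spec_mz_tols, dtype=np.float64)
    query_mzs = np.asarray(query_mzs, dtype=np.float64).ravel()

    # Peak i matches query q when spec[i] - tol[i] <= q <= spec[i] + tol[i]
    upper = spec_mzs + spec_mz_tols
    lower = spec_mzs - spec_mz_tols
    first = np.searchsorted(upper, query_mzs, side="left")      # first i with upper[i] >= q
    last = np.searchsorted(lower, query_mzs, side="right") - 1  # last i with lower[i] <= q

    matched = first <= last
    out = np.full((len(query_mzs), 2), -1, dtype=np.int32)
    out[matched, 0] = first[matched]
    out[matched, 1] = last[matched]
    return out

spec_mzs = np.array([100.00, 100.01, 100.02, 200.0])
spec_mz_tols = np.full(4, 0.015)
first_last = first_last_peak_idxes(spec_mzs, np.array([100.01, 150.0]), spec_mz_tols)
```

Both returned indexes are inclusive in this sketch, so the matched points for query `i` are `spec_mzs[first_last[i, 0] : first_last[i, 1] + 1]`.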
- peptdeep.mass_spec.match.match_one_raw_with_numba(spec_idxes, frag_start_idxes, frag_stop_idxes, all_frag_mzs, all_spec_mzs, all_spec_intensities, peak_start_idxes, peak_end_idxes, matched_intensities, matched_mz_errs, ppm, tol)
Internal function to match fragment mz values to spectrum mz values. matched_mz_errs[i] = np.inf if no peaks are matched.
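Because unmatched fragments are flagged with np.inf in matched_mz_errs, downstream statistics need to mask them first. The following sketch uses made-up arrays for illustration; the zero intensities for unmatched fragments are an assumption of this example, not something stated on this page.

```python
import numpy as np

# Illustrative values only: two matched and two unmatched fragments
matched_mz_errs = np.array([1.5, np.inf, -3.2, np.inf])
matched_intensities = np.array([1200.0, 0.0, 800.0, 0.0])  # zeros are assumed here

# np.inf marks "no peak matched", so keep only finite errors
is_matched = np.isfinite(matched_mz_errs)
n_matched = int(is_matched.sum())
mean_abs_err = float(np.abs(matched_mz_errs[is_matched]).mean())
total_matched_intensity = float(matched_intensities[is_matched].sum())
```

Taking a mean over the unmasked array would return inf, which is why the mask is applied before any aggregation.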
- peptdeep.mass_spec.match.match_profile_mz(spec_mzs: ndarray, query_mzs: ndarray, spec_mz_tols: ndarray, spec_intens: ndarray) → ndarray
Matching query masses against sorted MS2/spec profile masses, only highest peaks are returned.
- Parameters:
spec_mzs (np.ndarray) – MS2 or spec mz values, 1-D float array
query_mzs (np.ndarray) – query mz values, n-D float array
spec_mz_tols (np.ndarray) – Da tolerance array, same shape as spec_mzs
spec_intens (np.ndarray) – MS2 or spec intensity values, presumably the same shape as spec_mzs
- Returns:
np.ndarray of int32, the shape is the same as query_mzs. -1 means no peaks are matched for the query mz
- Return type:
np.ndarray
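The highest-peak selection described for match_profile_mz can be sketched as picking the most intense point inside each query's tolerance window. The function name `highest_peak_idxes` is hypothetical, and the sketch assumes that the lower and upper window bounds are sorted, as they are for constant or ppm-based tolerances; it illustrates the idea rather than reproducing the library's code.

```python
import numpy as np

def highest_peak_idxes(spec_mzs, query_mzs, spec_mz_tols, spec_intens):
    """Hypothetical re-implementation: index of the most intense peak within tolerance, else -1."""
    spec_mzs = np.asarray(spec_mzs, dtype=np.float64)
    spec_mz_tols = np.asarray(spec_mz_tols, dtype=np.float64)
    spec_intens = np.asarray(spec_intens, dtype=np.float64)
    query_mzs = np.asarray(query_mzs, dtype=np.float64)
    flat_queries = query_mzs.ravel()

    # Half-open window [start, stop) of peaks whose tolerance interval contains the query
    start = np.searchsorted(spec_mzs + spec_mz_tols, flat_queries, side="left")
    stop = np.searchsorted(spec_mzs - spec_mz_tols, flat_queries, side="right")

    out = np.full(flat_queries.shape, -1, dtype=np.int32)
    for i, (lo, hi) in enumerate(zip(start, stop)):
        if lo < hi:
            # Offset the local argmax back to a spectrum-wide index
            out[i] = lo + np.argmax(spec_intens[lo:hi])
    return out.reshape(query_mzs.shape)

spec_mzs = np.array([100.00, 100.01, 100.02, 200.0])
spec_intens = np.array([10.0, 50.0, 30.0, 5.0])
spec_mz_tols = np.full(4, 0.015)
query_mzs = np.array([[100.01, 150.0], [200.0, 100.02]])
top_idxes = highest_peak_idxes(spec_mzs, query_mzs, spec_mz_tols, spec_intens)
```

Compared with closest-peak matching on centroid data, this choice favours the apex of a profile peak over the point nearest to the theoretical m/z.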