peptdeep.model.ms2

Classes:

`IntenAwareLoss`([base_weight])	Loss weighted by intensity for MS2 models
`ModelMS2Bert`(charged_frag_types[, dropout, ...])	Using HuggingFace's BertEncoder for MS2 prediction
`ModelMS2Transformer`(num_frag_types[, ...])	Transformer model for MS2 prediction
`ModelMS2pDeep`(num_frag_types[, ...])	LSTM model for MS2 prediction similar to pDeep series
`pDeepModel`([charged_frag_types, dropout])	ModelInterface for MS2 prediction models

Functions:

`add_cutoff_metric`(metrics_describ, metrics_df)
`calc_ms2_similarity`(psm_df, ...[, ...])
`charged_frags_to_tensor`(charged_frags)	Convert a list of strings (charged fragment types, modloss fragment types) to a tensor
`normalize_fragment_intensities`(psm_df, ...)	Normalize the intensities to 0-1 values in place
`pearson`(x, y)	Compute Pearson correlation between 2 batches of 1-D tensors
`pearson_correlation`(x, y)	Compute Pearson correlation between 2 batches of 1-D tensors
`spearman`(x, y, device)	Compute Spearman correlation between 2 batches of 1-D tensors
`spearman_correlation`(x, y, device)	Compute Spearman correlation between 2 batches of 1-D tensors
`spectral_angle`(cos)
`tensor_to_charged_frags`(tensor)	Convert a tensor to a list of strings (charged fragment types, modloss fragment types)

class peptdeep.model.ms2.IntenAwareLoss(base_weight=0.2)

Bases: Module

Loss weighted by intensity for MS2 models

Methods:

`__init__`([base_weight])	Initialize internal Module state, shared by both nn.Module and ScriptModule.
`forward`(pred, target)	Define the computation performed at every call.

__init__(base_weight=0.2): Initialize internal Module state, shared by both nn.Module and ScriptModule.

forward(pred, target)

Define the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
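To make "loss weighted by intensity" concrete, here is a minimal sketch of one plausible form: an absolute error in which each term is scaled by `(target + base_weight)`, so errors on intense peaks count more while weak peaks keep a non-zero weight. The exact formula used by `IntenAwareLoss` is not stated on this page, so the weighting below is an assumption. The function name `intensity_aware_l1` is hypothetical, and plain Python lists stand in for tensors.

```python
from typing import Sequence


def intensity_aware_l1(
    pred: Sequence[float],
    target: Sequence[float],
    base_weight: float = 0.2,
) -> float:
    """Mean absolute error with each term weighted by (target + base_weight).

    Assumed formula for illustration only; not taken from the library source.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    if not target:
        raise ValueError("inputs must not be empty")
    weighted = [(t + base_weight) * abs(t - p) for p, t in zip(pred, target)]
    return sum(weighted) / len(weighted)


# The same absolute error (0.1) costs more on an intense peak than on a weak one.
loss_weak = intensity_aware_l1([0.1], [0.0])    # (0.0 + 0.2) * 0.1
loss_strong = intensity_aware_l1([0.9], [1.0])  # (1.0 + 0.2) * 0.1
```

With `base_weight=0`, peaks whose target intensity is zero would contribute nothing to the loss. A positive base weight keeps them in play.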

class peptdeep.model.ms2.ModelMS2Bert(charged_frag_types, dropout=0.1, nlayers=4, hidden=256, output_attentions=False, **kwargs)

Bases: Module

Using HuggingFace’s BertEncoder for MS2 prediction

Methods:

`__init__`(charged_frag_types[, dropout, ...])	Initialize internal Module state, shared by both nn.Module and ScriptModule.
`forward`(aa_indices, mod_x, charges, NCEs, ...)	Define the computation performed at every call.

Attributes:

`output_attentions`
`supported_charged_frag_types`

__init__(charged_frag_types, dropout=0.1, nlayers=4, hidden=256, output_attentions=False, **kwargs): Initialize internal Module state, shared by both nn.Module and ScriptModule.

forward(aa_indices, mod_x, charges: Tensor, NCEs: Tensor, instrument_indices)

Define the computation performed at every call.

Should be overridden by all subclasses.

property output_attentions

property supported_charged_frag_types

class peptdeep.model.ms2.ModelMS2Transformer(num_frag_types: int, num_modloss_types: int = 0, mask_modloss: bool = True, dropout: float = 0.1, nlayers: int = 4, hidden: int = 256, **kwargs)

Bases: Module

Transformer model for MS2 prediction

Parameters:

num_frag_types (int) – Total number of fragment types of a fragmentation position to predict
num_modloss_types (int, optional) – Number of modloss fragment types of a fragmentation position to predict, by default 0
mask_modloss (bool, optional) – If True, the modloss layer will be disabled, by default True
dropout (float, optional) – Dropout rate, by default 0.1
nlayers (int, optional) – Number of transformer layers, by default 4
hidden (int, optional) – Hidden layer size, by default 256

Methods:

`__init__`(num_frag_types[, ...])	Initialize internal Module state, shared by both nn.Module and ScriptModule.
`forward`(aa_indices, mod_x, charges, NCEs, ...)	Define the computation performed at every call.

__init__(num_frag_types: int, num_modloss_types: int = 0, mask_modloss: bool = True, dropout: float = 0.1, nlayers: int = 4, hidden: int = 256, **kwargs): Initialize internal Module state, shared by both nn.Module and ScriptModule.

forward(aa_indices, mod_x, charges: Tensor, NCEs: Tensor, instrument_indices)

Define the computation performed at every call.

Should be overridden by all subclasses.

class peptdeep.model.ms2.ModelMS2pDeep(num_frag_types, num_modloss_types=0, mask_modloss=True, dropout=0.1, **kwargs)

Bases: Module

LSTM model for MS2 prediction similar to pDeep series

Methods:

`__init__`(num_frag_types[, ...])	Initialize internal Module state, shared by both nn.Module and ScriptModule.
`forward`(aa_indices, mod_x, charges, NCEs, ...)	Define the computation performed at every call.

__init__(num_frag_types, num_modloss_types=0, mask_modloss=True, dropout=0.1, **kwargs): Initialize internal Module state, shared by both nn.Module and ScriptModule.

forward(aa_indices, mod_x, charges: Tensor, NCEs: Tensor, instrument_indices)

Define the computation performed at every call.

Should be overridden by all subclasses.

peptdeep.model.ms2.add_cutoff_metric(metrics_describ, metrics_df, thres=0.9)

peptdeep.model.ms2.calc_ms2_similarity(psm_df: DataFrame, predict_intensity_df: DataFrame, fragment_intensity_df: DataFrame, charged_frag_types: List = None, metrics=['PCC', 'COS', 'SA', 'SPC'], GPU=True, batch_size=10240, verbose=False, spc_top_k=0) → Tuple[DataFrame, DataFrame]

peptdeep.model.ms2.charged_frags_to_tensor(charged_frags: List[str]) → Tensor

Convert a list of strings (charged fragment types, modloss fragment types) to a tensor

Parameters: charged_frags (List[str]) – List of charged fragment type strings, e.g. 'b_z1'

peptdeep.model.ms2.normalize_fragment_intensities(psm_df: DataFrame, frag_intensity_df: DataFrame)

Normalize the intensities to 0-1 values in place

Parameters:

psm_df (pd.DataFrame) – PSM DataFrame
frag_intensity_df (pd.DataFrame) – Fragment intensity DataFrame to be normalized. Intensities will be normalized in place.
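
As an illustration of in-place normalization to 0-1 values, the sketch below divides each precursor's block of fragment rows by that block's largest intensity. Two things are assumptions rather than facts from this page: that normalization is done per precursor by its maximum, and that each PSM points at its fragment rows through a start/stop index pair (represented here by the hypothetical `frag_ranges` argument). Nested Python lists stand in for the DataFrames.

```python
from typing import List, Tuple


def normalize_in_place(
    intensities: List[List[float]],
    frag_ranges: List[Tuple[int, int]],
) -> None:
    """Scale each precursor's fragment rows so the block maximum becomes 1.0.

    `intensities` is a row-per-fragment-position table; `frag_ranges` holds one
    (start, stop) row slice per precursor. The table is modified in place.
    """
    for start, stop in frag_ranges:
        block = intensities[start:stop]  # same row objects, so edits are in place
        max_val = max((v for row in block for v in row), default=0.0)
        if max_val <= 0:
            continue  # empty or all-zero block: leave untouched
        for row in block:
            for j, value in enumerate(row):
                row[j] = value / max_val


# Two precursors: rows 0-1 belong to the first, row 2 to the second.
frag = [[10.0, 0.0], [5.0, 20.0], [3.0, 1.0]]
normalize_in_place(frag, [(0, 2), (2, 3)])
```

Because the function returns `None` and mutates its input, callers that still need the raw intensities should copy the table first.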

class peptdeep.model.ms2.pDeepModel(charged_frag_types=['b_z1', 'b_z2', 'y_z1', 'y_z2', 'b_modloss_z1', 'b_modloss_z2', 'y_modloss_z1', 'y_modloss_z2'], dropout=0.1, model_class: torch.nn.modules.module.Module = <class 'peptdeep.model.ms2.ModelMS2Bert'>, device: str = 'gpu', mask_modloss: bool | None = None, override_from_weights: bool = False, **kwargs)

Bases: ModelInterface

ModelInterface for MS2 prediction models

Parameters:

charged_frag_types (List[str]) – Charged fragment types to predict
dropout (float, optional) – Dropout rate, by default 0.1
model_class (torch.nn.Module, optional) – Ms2 Model class, by default ModelMS2Bert
device (str, optional) – Device to run the model, by default “gpu”
override_from_weights (bool, optional) – Override the requested charged fragment types with those stored in the model weights on loading, by default False. This makes it possible to predict all fragment types supported by the weights even if the user doesn’t know which fragment types those are, so the model is always in a safe-to-predict state.
mask_modloss (bool, optional, deprecated) – Mask the modloss fragments. This parameter is deprecated and will be removed in the future. To mask the modloss fragments, leave them out of charged_frag_types.

Methods:

`__init__`([charged_frag_types, dropout])
`bootstrap_nce_search`(psm_df, ...[, ...])
`grid_nce_search`(psm_df, fragment_intensity_df)
`predict`(precursor_df, *[, batch_size, ...])	Predict MS2 fragment intensities
`predict_mp`(**kwargs)	Predict with multiprocessing if no GPUs are available.
`test`(precursor_df, fragment_intensity_df[, ...])
`train`(precursor_df, fragment_intensity_df, *)	Train the model according to specifications.
`train_with_warmup`(precursor_df, ...[, ...])	Train the model according to specifications.

__init__(charged_frag_types=['b_z1', 'b_z2', 'y_z1', 'y_z2', 'b_modloss_z1', 'b_modloss_z2', 'y_modloss_z1', 'y_modloss_z2'], dropout=0.1, model_class: torch.nn.modules.module.Module = <class 'peptdeep.model.ms2.ModelMS2Bert'>, device: str = 'gpu', mask_modloss: bool | None = None, override_from_weights: bool = False, **kwargs)

Parameters:

device (str, optional) – Device type, one of ‘get_available’, ‘cpu’, ‘mps’, ‘gpu’ (or ‘cuda’), by default ‘gpu’
fixed_sequence_len (int, optional) – See fixed_sequence_len, defaults to 0.
min_pred_value (float, optional) – See min_pred_value, defaults to 0.0.

bootstrap_nce_search(psm_df: DataFrame, fragment_intensity_df: DataFrame, nce_first=15, nce_last=45, nce_step=3, instrument='Lumos', charged_frag_types: List = None, metric='PCC>0.9', max_psm_subset=3000, n_bootstrap=3, callback=None)

grid_nce_search(psm_df: DataFrame, fragment_intensity_df: DataFrame, nce_first=15, nce_last=45, nce_step=3, search_instruments=['Lumos'], charged_frag_types: List = None, metric='PCC>0.9', max_psm_subset=1000000, callback=None)

predict(precursor_df: DataFrame, *, batch_size=1024, verbose=False, reference_frag_df=None, allow_unsafe_predictions=False, **kwargs) → DataFrame
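
The `nce_first`, `nce_last`, `nce_step` and `search_instruments` arguments suggest an exhaustive search over (NCE, instrument) pairs that keeps the best-scoring one. The sketch below shows that idea only and is not the library's implementation. The `score` callable is hypothetical and stands in for "predict at this setting, then compute a metric such as the fraction of PSMs with PCC > 0.9". Treating `nce_last` as an inclusive endpoint is also an assumption.

```python
from typing import Callable, List, Optional, Sequence, Tuple


def nce_grid(nce_first: float = 15.0, nce_last: float = 45.0,
             nce_step: float = 3.0) -> List[float]:
    """NCE values from nce_first up to nce_last (assumed inclusive)."""
    if nce_step <= 0:
        raise ValueError("nce_step must be positive")
    count = int((nce_last - nce_first) // nce_step)
    return [nce_first + i * nce_step for i in range(count + 1)]


def grid_search(
    score: Callable[[float, str], float],
    instruments: Sequence[str] = ("Lumos",),
    nce_first: float = 15.0,
    nce_last: float = 45.0,
    nce_step: float = 3.0,
) -> Optional[Tuple[float, str, float]]:
    """Return (nce, instrument, score) for the highest-scoring grid point."""
    best: Optional[Tuple[float, str, float]] = None
    for instrument in instruments:
        for nce in nce_grid(nce_first, nce_last, nce_step):
            value = score(nce, instrument)
            if best is None or value > best[2]:
                best = (nce, instrument, value)
    return best


# Toy scoring function that peaks at NCE 27, used only to exercise the search.
def toy_score(nce: float, instrument: str) -> float:
    return -abs(nce - 27.0)


grid = nce_grid()
best = grid_search(toy_score)
```

A bootstrap variant would presumably repeat such a search on several random PSM subsets and aggregate the winners, which is what `n_bootstrap` and `max_psm_subset` hint at.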

Predict MS2 fragment intensities

Parameters:

precursor_df (pd.DataFrame) – Precursor DataFrame
batch_size (int, optional) – Batch size, by default 1024
verbose (bool, optional) – Verbose, by default False
reference_frag_df (pd.DataFrame, optional) – Reference fragment intensity DataFrame, by default None
allow_unsafe_predictions (bool, optional) – Allow a newly created, randomly initialized model to be used for prediction, by default False

Returns:

Predicted fragment intensities

Return type:

pd.DataFrame

predict_mp(**kwargs) → DataFrame: Predict with multiprocessing if no GPUs are available. Note that this multiprocessing method only works for models that predict values in place within the precursor_df.

test(precursor_df: DataFrame, fragment_intensity_df: DataFrame, default_instrument: str = 'Lumos', default_nce: float = 30.0) → DataFrame

train(precursor_df: DataFrame, fragment_intensity_df, *, batch_size=1024, epoch=20, warmup_epoch=0, lr=1e-05, verbose=False, verbose_each_epoch=False, **kwargs): Train the model according to specifications.

train_with_warmup(precursor_df: DataFrame, fragment_intensity_df, *, batch_size=1024, epoch=10, warmup_epoch=5, lr=1e-05, verbose=False, verbose_each_epoch=False, **kwargs): Train the model according to specifications. Includes a warmup phase for lr scheduling, with a linear increase followed by a cosine decrease.
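
A learning-rate schedule with a linear warmup followed by a cosine decay can be sketched as below. This is a generic formulation, not the library's scheduler. The function name `lr_with_warmup` is hypothetical, and it is an assumption that the total epoch count includes the warmup epochs.

```python
import math


def lr_with_warmup(epoch: int, base_lr: float = 1e-5,
                   warmup_epoch: int = 5, total_epoch: int = 10) -> float:
    """Learning rate at a given epoch: linear ramp up, then cosine decay to 0."""
    if warmup_epoch > 0 and epoch < warmup_epoch:
        # Linear increase from 0 to base_lr over the warmup epochs.
        return base_lr * epoch / warmup_epoch
    # Cosine decrease from base_lr to 0 over the remaining epochs.
    decay_span = max(1, total_epoch - warmup_epoch)
    progress = min(1.0, (epoch - warmup_epoch) / decay_span)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


# Epochs 0..10 with the defaults: rises until epoch 5, then falls back to 0.
schedule = [lr_with_warmup(e) for e in range(11)]
```

The warmup keeps early updates small while the optimizer state is still uninformative. The cosine tail lowers the rate smoothly instead of in discrete steps.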

peptdeep.model.ms2.pearson(x: Tensor, y: Tensor)

Compute Pearson correlation between 2 batches of 1-D tensors

Parameters:

x (torch.Tensor) – Shape (Batch, n)
y (torch.Tensor) – Shape (Batch, n)
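
For inputs of shape (Batch, n), the correlation is computed row by row, yielding one value per batch entry. The plain-Python sketch below shows that computation with nested lists standing in for tensors. The function name `pearson_batch` is hypothetical, and returning 0.0 for a constant row is an assumption about a case this page does not specify.

```python
import math
from typing import List, Sequence


def pearson_batch(x: Sequence[Sequence[float]],
                  y: Sequence[Sequence[float]]) -> List[float]:
    """Row-wise Pearson correlation for two (Batch, n) inputs."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same batch size")
    result = []
    for row_x, row_y in zip(x, y):
        if len(row_x) != len(row_y) or not row_x:
            raise ValueError("paired rows must be non-empty and equally long")
        n = len(row_x)
        mean_x = sum(row_x) / n
        mean_y = sum(row_y) / n
        cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(row_x, row_y))
        norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in row_x))
        norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in row_y))
        denom = norm_x * norm_y
        # Assumption: a constant row has no defined correlation; report 0.0.
        result.append(cov / denom if denom > 0 else 0.0)
    return result


# First pair is perfectly correlated, second perfectly anti-correlated.
pcc = pearson_batch([[1.0, 2.0, 3.0], [1.0, 2.0, 3.0]],
                    [[2.0, 4.0, 6.0], [3.0, 2.0, 1.0]])
```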

peptdeep.model.ms2.pearson_correlation(x: Tensor, y: Tensor)

Compute Pearson correlation between 2 batches of 1-D tensors

Parameters:

x (torch.Tensor) – Shape (Batch, n)
y (torch.Tensor) – Shape (Batch, n)

peptdeep.model.ms2.spearman(x: Tensor, y: Tensor, device)

Compute Spearman correlation between 2 batches of 1-D tensors

Parameters:

x (torch.Tensor) – Shape (Batch, n)
y (torch.Tensor) – Shape (Batch, n)
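
Spearman correlation is the Pearson correlation of the ranks of the values, again computed per row. The sketch below uses plain Python and assigns average ranks to ties. How the library handles ties is not stated here, so that choice is an assumption. The helper names are hypothetical.

```python
import math
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def _pearson(a: Sequence[float], b: Sequence[float]) -> float:
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    denom = math.sqrt(sum((p - mean_a) ** 2 for p in a)) * \
        math.sqrt(sum((q - mean_b) ** 2 for q in b))
    return cov / denom if denom > 0 else 0.0


def spearman_batch(x: Sequence[Sequence[float]],
                   y: Sequence[Sequence[float]]) -> List[float]:
    """Row-wise Spearman correlation for two (Batch, n) inputs."""
    return [_pearson(average_ranks(rx), average_ranks(ry))
            for rx, ry in zip(x, y)]


# A monotonic but non-linear relation still gives a rank correlation of 1.
rho = spearman_batch([[1.0, 2.0, 3.0, 4.0]], [[1.0, 4.0, 9.0, 16.0]])
tied = average_ranks([10.0, 20.0, 10.0])
```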

peptdeep.model.ms2.spearman_correlation(x: Tensor, y: Tensor, device)

Compute Spearman correlation between 2 batches of 1-D tensors

Parameters:

x (torch.Tensor) – Shape (Batch, n)
y (torch.Tensor) – Shape (Batch, n)

peptdeep.model.ms2.spectral_angle(cos)

peptdeep.model.ms2.tensor_to_charged_frags(tensor: Tensor) → List[str]
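
This function is undocumented here. Its name and single `cos` argument suggest the normalized spectral angle commonly used to compare spectra, SA = 1 - 2·arccos(cos)/π, where `cos` is the cosine similarity between predicted and measured intensities. The sketch below implements that common definition for a single float. Whether the library uses exactly this form, and whether it clamps its input, are assumptions.

```python
import math


def spectral_angle_from_cos(cos: float) -> float:
    """Normalized spectral angle: 1 for identical direction, 0 for orthogonal."""
    # Clamp to the valid arccos domain to absorb floating-point overshoot.
    cos = max(-1.0, min(1.0, cos))
    return 1.0 - 2.0 * math.acos(cos) / math.pi


sa_identical = spectral_angle_from_cos(1.0)
sa_orthogonal = spectral_angle_from_cos(0.0)
```

Compared with the raw cosine, the spectral angle spreads out values near 1, which makes differences between already good predictions easier to see.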

Convert a tensor to a list of strings (charged fragment types, modloss fragment types)

Parameters: tensor (torch.Tensor) – Tensor of int32