{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Tutorial: using `ModelManager`" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from peptdeep.pretrained_models import ModelManager" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`ModelManager` is the main entry point for accessing the MS2/RT/CCS models." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "model_mgr = ModelManager(mask_modloss=True, device='cpu')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Most default parameters and attributes of the `ModelManager` class are controlled by `peptdeep.settings.global_settings`, which is a dict.\n", "\n", "```\n", "from peptdeep.settings import global_settings\n", "```\n", "\n", "The default values of `peptdeep.settings.global_settings` are defined in [default_settings.yaml](../peptdeep/constants/default_settings.yaml)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### `ModelManager.load_installed_models`\n", "\n", "`ModelManager.load_installed_models(model_type)` loads the installed models of a given type. The `model_type` can be:\n", "- generic: generic RT/CCS/MS2 models including HLA\n", "- HLA: currently the same as `generic`\n", "- phos: RT/CCS/MS2 models for Phospho@S/T/Y\n", "- digly: RT/CCS/MS2 models for GlyGly@K\n", "\n", "Calling `ModelManager(...)` also calls `ModelManager.load_installed_models` implicitly; the default `model_type` is `global_settings['model_mgr']['model_type']`." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Test the RT model\n", "\n", "Use the 11 iRT peptides to test the RT model." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from peptdeep.model.rt import IRT_PEPTIDE_DF" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " sequence pep_name irt mods mod_sites nAA\n", "0 LGGNEQVTR RT-pep a -24.92 9\n", "1 GAGSSEPVTGLDAK RT-pep b 0.00 Phospho@S 5 14\n", "2 VEATFGVDESNAK RT-pep c 12.39 13\n", "3 YILAGVENSK RT-pep d 19.79 10\n", "4 TPVISGGPYEYR RT-pep e 28.71 12\n", "5 TPVITGAPYEYR RT-pep f 33.38 12\n", "6 DGLDAASYYAPVR RT-pep g 42.26 13\n", "7 ADVTPADFSEWSK RT-pep h 54.62 13\n", "8 GTFIIDPGGVIR RT-pep i 70.52 12\n", "9 GTFIIDPAAVIR RT-pep k 87.23 12\n", "10 LFLQFGAQGSPFLK RT-pep l 100.00 14" ] }, "execution_count": null, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df = IRT_PEPTIDE_DF.copy()\n", "# randomly add some modifications, this may change the real irt\n", "df.loc[1,'mods'] = 'Phospho@S'\n", "df.loc[1,'mod_sites'] = '5'\n", "df" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "2022-09-09 21:54:02> Predicting RT ...\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "100%|██████████| 5/5 [00:00<00:00, 125.27it/s]\n" ] }, { "data": { "text/html": [ "
" ], "text/plain": [ " sequence pep_name irt mods mod_sites nAA rt_pred \\\n", "0 LGGNEQVTR RT-pep a -24.92 9 0.184235 \n", "1 GAGSSEPVTGLDAK RT-pep b 0.00 Phospho@S 5 14 0.266746 \n", "2 VEATFGVDESNAK RT-pep c 12.39 13 0.266133 \n", "3 YILAGVENSK RT-pep d 19.79 10 0.290495 \n", "4 TPVISGGPYEYR RT-pep e 28.71 12 0.303847 \n", "5 TPVITGAPYEYR RT-pep f 33.38 12 0.316514 \n", "6 DGLDAASYYAPVR RT-pep g 42.26 13 0.324423 \n", "7 ADVTPADFSEWSK RT-pep h 54.62 13 0.345197 \n", "8 GTFIIDPGGVIR RT-pep i 70.52 12 0.394248 \n", "9 GTFIIDPAAVIR RT-pep k 87.23 12 0.434775 \n", "10 LFLQFGAQGSPFLK RT-pep l 100.00 14 0.459583 \n", "\n", " rt_norm_pred irt_pred \n", "0 0.184235 -26.123537 \n", "1 0.266746 11.916059 \n", "2 0.266133 11.633120 \n", "3 0.290495 22.864811 \n", "4 0.303847 29.020259 \n", "5 0.316514 34.860122 \n", "6 0.324423 38.506308 \n", "7 0.345197 48.083890 \n", "8 0.394248 70.697474 \n", "9 0.434775 89.381150 \n", "10 0.459583 100.818303 " ] }, "execution_count": null, "metadata": {}, "output_type": "execute_result" } ], "source": [ "model_mgr.load_installed_models('phos')\n", "model_mgr.predict_rt(df)\n", "model_mgr.rt_model.add_irt_column_to_precursor_df(df)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Training RT model on df with the `rt_norm` column:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "2022-09-09 21:54:02> 11 PSMs for RT training/fine-tuning\n", "2022-09-09 21:54:09> Predicting RT ...\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "100%|██████████| 5/5 [00:00<00:00, 151.56it/s]\n" ] }, { "data": { "text/html": [ "
" ], "text/plain": [ " sequence pep_name irt mods mod_sites nAA rt_pred \\\n", "0 LGGNEQVTR RT-pep a -24.92 9 0.127189 \n", "1 GAGSSEPVTGLDAK RT-pep b 0.00 Phospho@S 5 14 0.199919 \n", "2 VEATFGVDESNAK RT-pep c 12.39 13 0.295237 \n", "3 YILAGVENSK RT-pep d 19.79 10 0.357351 \n", "4 TPVISGGPYEYR RT-pep e 28.71 12 0.429762 \n", "5 TPVITGAPYEYR RT-pep f 33.38 12 0.392419 \n", "6 DGLDAASYYAPVR RT-pep g 42.26 13 0.387393 \n", "7 ADVTPADFSEWSK RT-pep h 54.62 13 0.634485 \n", "8 GTFIIDPGGVIR RT-pep i 70.52 12 0.671310 \n", "9 GTFIIDPAAVIR RT-pep k 87.23 12 0.699334 \n", "10 LFLQFGAQGSPFLK RT-pep l 100.00 14 0.607442 \n", "\n", " rt_norm_pred irt_pred rt_norm \n", "0 0.127189 -18.916407 0.000000 \n", "1 0.199919 -5.504272 0.199488 \n", "2 0.295237 12.073141 0.298671 \n", "3 0.357351 23.527389 0.357909 \n", "4 0.429762 36.880596 0.429315 \n", "5 0.392419 29.994243 0.466699 \n", "6 0.387393 29.067502 0.537784 \n", "7 0.634485 74.633402 0.636728 \n", "8 0.671310 81.424123 0.764009 \n", "9 0.699334 86.592033 0.897775 \n", "10 0.607442 69.646337 1.000000 " ] }, "execution_count": null, "metadata": {}, "output_type": "execute_result" } ], "source": [ "def normalize_irt(df):\n", " min_rt = df.irt.min()\n", " df['rt_norm'] = (\n", " df.irt - min_rt\n", " ) / (df.irt.max()-min_rt)\n", "normalize_irt(df)\n", "model_mgr.epoch_to_train_rt_ccs=50\n", "model_mgr.train_rt_model(df)\n", "model_mgr.predict_rt(df)\n", "model_mgr.rt_model.add_irt_column_to_precursor_df(df)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Test the CCS model" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "2022-09-09 21:54:09> Predicting mobility ...\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "100%|██████████| 5/5 [00:00<00:00, 117.53it/s]\n" ] }, { "data": { "text/html": [ "
" ], "text/plain": [ " sequence pep_name irt mods mod_sites nAA rt_pred \\\n", "0 LGGNEQVTR RT-pep a -24.92 9 0.127189 \n", "1 GAGSSEPVTGLDAK RT-pep b 0.00 Phospho@S 5 14 0.199919 \n", "2 VEATFGVDESNAK RT-pep c 12.39 13 0.295237 \n", "3 YILAGVENSK RT-pep d 19.79 10 0.357351 \n", "4 TPVISGGPYEYR RT-pep e 28.71 12 0.429762 \n", "5 TPVITGAPYEYR RT-pep f 33.38 12 0.392419 \n", "6 DGLDAASYYAPVR RT-pep g 42.26 13 0.387393 \n", "7 ADVTPADFSEWSK RT-pep h 54.62 13 0.634485 \n", "8 GTFIIDPGGVIR RT-pep i 70.52 12 0.671310 \n", "9 GTFIIDPAAVIR RT-pep k 87.23 12 0.699334 \n", "10 LFLQFGAQGSPFLK RT-pep l 100.00 14 0.607442 \n", "\n", " rt_norm_pred irt_pred rt_norm charge ccs_pred precursor_mz \\\n", "0 0.127189 -18.916407 0.000000 2 331.279816 487.256705 \n", "1 0.199919 -5.504272 0.199488 2 381.067841 684.805772 \n", "2 0.295237 12.073141 0.298671 2 394.208893 683.827889 \n", "3 0.357351 23.527389 0.357909 2 364.828003 547.298039 \n", "4 0.429762 36.880596 0.429315 2 394.317596 669.838059 \n", "5 0.392419 29.994243 0.466699 2 399.848633 683.853709 \n", "6 0.387393 29.067502 0.537784 2 399.736542 699.338423 \n", "7 0.634485 74.633402 0.636728 2 405.532562 726.835714 \n", "8 0.671310 81.424123 0.764009 2 379.443451 622.853512 \n", "9 0.699334 86.592033 0.897775 2 387.886780 636.869163 \n", "10 0.607442 69.646337 1.000000 2 435.544861 776.929751 \n", "\n", " mobility_pred \n", "0 0.815533 \n", "1 0.941902 \n", "2 0.974369 \n", "3 0.899500 \n", "4 0.974434 \n", "5 0.988309 \n", "6 0.988252 \n", "7 1.002953 \n", "8 0.936954 \n", "9 0.958034 \n", "10 1.077836 " ] }, "execution_count": null, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df['charge'] = 2\n", "model_mgr.predict_mobility(df)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Test the MS2 model" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "2022-09-09 21:54:10> Predicting MS2 ...\n" ] }, { "name": 
"stderr", "output_type": "stream", "text": [ "100%|██████████| 5/5 [00:00<00:00, 82.83it/s]\n" ] }, { "data": { "text/html": [ "
" ], "text/plain": [ " b_z1 b_z2 y_z1 y_z2 b_modloss_z1 b_modloss_z2 \\\n", "0 0.000000 0.0 1.000000 0.021727 0.0 0.0 \n", "1 0.191613 0.0 0.343992 0.000000 0.0 0.0 \n", "2 0.063825 0.0 0.119938 0.015200 0.0 0.0 \n", "3 0.033420 0.0 0.257022 0.000000 0.0 0.0 \n", "4 0.027311 0.0 0.340053 0.000000 0.0 0.0 \n", ".. ... ... ... ... ... ... \n", "118 0.000000 0.0 0.101413 0.000000 0.0 0.0 \n", "119 0.000000 0.0 0.672498 0.000000 0.0 0.0 \n", "120 0.000000 0.0 0.034437 0.000000 0.0 0.0 \n", "121 0.000000 0.0 0.125430 0.000000 0.0 0.0 \n", "122 0.000000 0.0 0.112338 0.000000 0.0 0.0 \n", "\n", " y_modloss_z1 y_modloss_z2 \n", "0 0.0 0.0 \n", "1 0.0 0.0 \n", "2 0.0 0.0 \n", "3 0.0 0.0 \n", "4 0.0 0.0 \n", ".. ... ... \n", "118 0.0 0.0 \n", "119 0.0 0.0 \n", "120 0.0 0.0 \n", "121 0.0 0.0 \n", "122 0.0 0.0 \n", "\n", "[123 rows x 8 columns]" ] }, "execution_count": null, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df['charge'] = 2\n", "inten_df = model_mgr.predict_ms2(df)\n", "inten_df" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note that modloss fragment intensities are masked in this case (`ModelManager(mask_modloss=True, ...)`), so all modloss columns are zero, even for the phosphopeptide:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " b_z1 b_z2 y_z1 y_z2 b_modloss_z1 b_modloss_z2 \\\n", "8 0.000000 0.0 0.000000 0.000000 0.0 0.0 \n", "9 0.063835 0.0 0.012835 0.000606 0.0 0.0 \n", "10 0.066177 0.0 0.000000 0.000000 0.0 0.0 \n", "11 0.061181 0.0 0.064921 0.000000 0.0 0.0 \n", "12 0.000000 0.0 0.082699 0.000000 0.0 0.0 \n", "13 0.000000 0.0 1.000000 0.080108 0.0 0.0 \n", "14 0.000000 0.0 0.068587 0.000000 0.0 0.0 \n", "15 0.000000 0.0 0.293111 0.000000 0.0 0.0 \n", "16 0.000000 0.0 0.185996 0.000000 0.0 0.0 \n", "17 0.000000 0.0 0.024486 0.000000 0.0 0.0 \n", "18 0.000000 0.0 0.105864 0.000000 0.0 0.0 \n", "19 0.000000 0.0 0.148301 0.000000 0.0 0.0 \n", "20 0.000000 0.0 0.046693 0.000000 0.0 0.0 \n", "\n", " y_modloss_z1 y_modloss_z2 \n", "8 0.0 0.0 \n", "9 0.0 0.0 \n", "10 0.0 0.0 \n", "11 0.0 0.0 \n", "12 0.0 0.0 \n", "13 0.0 0.0 \n", "14 0.0 0.0 \n", "15 0.0 0.0 \n", "16 0.0 0.0 \n", "17 0.0 0.0 \n", "18 0.0 0.0 \n", "19 0.0 0.0 \n", "20 0.0 0.0 " ] }, "execution_count": null, "metadata": {}, "output_type": "execute_result" } ], "source": [ "phos_precursor_id = 1 # we manually assigned this peptide as phospho\n", "inten_df.iloc[\n", " df.loc[phos_precursor_id,'frag_start_idx']:\n", " df.loc[phos_precursor_id,'frag_stop_idx'],:\n", "]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Masking is set when the manager is created: `ModelManager(mask_modloss=True, ...)` zeroes the modloss columns, as in the output below, whereas `mask_modloss=False` keeps the predicted modloss intensities:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "2022-09-09 21:54:13> Predicting MS2 ...\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "100%|██████████| 5/5 [00:00<00:00, 86.70it/s]\n" ] }, { "data": { "text/html": [ "
" ], "text/plain": [ " b_z1 b_z2 y_z1 y_z2 b_modloss_z1 b_modloss_z2 \\\n", "8 0.000000 0.0 0.000000 0.000000 0.0 0.0 \n", "9 0.063835 0.0 0.012835 0.000606 0.0 0.0 \n", "10 0.066177 0.0 0.000000 0.000000 0.0 0.0 \n", "11 0.061181 0.0 0.064921 0.000000 0.0 0.0 \n", "12 0.000000 0.0 0.082699 0.000000 0.0 0.0 \n", "13 0.000000 0.0 1.000000 0.080108 0.0 0.0 \n", "14 0.000000 0.0 0.068587 0.000000 0.0 0.0 \n", "15 0.000000 0.0 0.293111 0.000000 0.0 0.0 \n", "16 0.000000 0.0 0.185996 0.000000 0.0 0.0 \n", "17 0.000000 0.0 0.024486 0.000000 0.0 0.0 \n", "18 0.000000 0.0 0.105864 0.000000 0.0 0.0 \n", "19 0.000000 0.0 0.148301 0.000000 0.0 0.0 \n", "20 0.000000 0.0 0.046693 0.000000 0.0 0.0 \n", "\n", " y_modloss_z1 y_modloss_z2 \n", "8 0.0 0.0 \n", "9 0.0 0.0 \n", "10 0.0 0.0 \n", "11 0.0 0.0 \n", "12 0.0 0.0 \n", "13 0.0 0.0 \n", "14 0.0 0.0 \n", "15 0.0 0.0 \n", "16 0.0 0.0 \n", "17 0.0 0.0 \n", "18 0.0 0.0 \n", "19 0.0 0.0 \n", "20 0.0 0.0 " ] }, "execution_count": null, "metadata": {}, "output_type": "execute_result" } ], "source": [ "model_mgr = ModelManager(mask_modloss=True, device='cpu')\n", "model_mgr.load_installed_models('phos')\n", "df = IRT_PEPTIDE_DF.copy()\n", "df.loc[1,'mods'] = 'Phospho@S'\n", "df.loc[1,'mod_sites'] = '5'\n", "df['charge'] = 2\n", "inten_df = model_mgr.predict_ms2(df)\n", "inten_df.iloc[\n", " df.loc[1,'frag_start_idx']:\n", " df.loc[1,'frag_stop_idx'],:\n", "]" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3.8.3 ('base')", "language": "python", "name": "python3" }, "language_info": { "name": "python", "version": "3.8.3" }, "vscode": { "interpreter": { "hash": "8a3b27e141e49c996c9b863f8707e97aabd49c4a7e8445b9b783b34e4a21a9b2" } } }, "nbformat": 4, "nbformat_minor": 2 }