{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Tutorial: Predicting a Spectral Library from FASTA" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%reload_ext autoreload\n", "%autoreload 2" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Use this notebook to predict a spectral library from FASTA files and save it as an HDF file.\n", "Then use [alphapeptdeep_hdf_to_tsv.ipynb](alphapeptdeep_hdf_to_tsv.ipynb) to convert the HDF file into TSV (DIA-NN/Spectronaut) format." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Prepare the data and settings" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from alphabase.peptide.fragment import get_charged_frag_types\n", "import pandas as pd\n", "\n", "fasta_list = [\n", " r\"y:\User\Feng\fasta\uniprot_human_reviewed_20210309.fasta\"\n", "]\n", "# output spectral library in hdf format\n", "hdf_path = r'y:\User\Feng\speclib\human_swissprot.speclib.hdf'\n", "\n", "protease=\"trypsin\"\n", "nce = 30\n", "instrument = 'timsTOF'\n", "\n", "add_phos=False\n", "\n", "protease_dict = {\n", " \"trypsin\": \"([KR])\", # this is in fact \"trypsin/P\"\n", " \"lysc\": \"([K])\",\n", " \"lysn\": r\"\w(?=K)\", # raw string avoids an invalid-escape warning\n", "}\n", "min_pep_len = 7\n", "max_pep_len = 35\n", "max_miss_cleave = 1\n", "max_var_mods = 1\n", "min_pep_mz = 400\n", "max_pep_mz = 1200\n", "precursor_charge_min = 2\n", "precursor_charge_max = 4\n", "\n", "var_mods = []\n", "var_mods += ['Oxidation@M']\n", "#var_mods += ['Phospho@S','Phospho@T','Phospho@Y']\n", "\n", "\n", "frag_types = get_charged_frag_types(\n", " ['b','y']+\n", " (['b_modloss','y_modloss'] if add_phos else []), \n", " 2\n", ")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "digest = protease_dict[protease] # Or digest = \"trypsin/P\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`protease` and `digest` are defined by
regular expressions. alphabase provides several built-in enzymes, so for most enzymes we don't need to write the regular expression ourselves. Here are all the built-in enzymes:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'arg-c': 'R',\n", " 'asp-n': '\\w(?=D)',\n", " 'bnps-skatole': 'W',\n", " 'caspase 1': '(?<=[FWYL]\\w[HAT])D(?=[^PEDQKR])',\n", " 'caspase 2': '(?<=DVA)D(?=[^PEDQKR])',\n", " 'caspase 3': '(?<=DMQ)D(?=[^PEDQKR])',\n", " 'caspase 4': '(?<=LEV)D(?=[^PEDQKR])',\n", " 'caspase 5': '(?<=[LW]EH)D',\n", " 'caspase 6': '(?<=VE[HI])D(?=[^PEDQKR])',\n", " 'caspase 7': '(?<=DEV)D(?=[^PEDQKR])',\n", " 'caspase 8': '(?<=[IL]ET)D(?=[^PEDQKR])',\n", " 'caspase 9': '(?<=LEH)D',\n", " 'caspase 10': '(?<=IEA)D',\n", " 'chymotrypsin high specificity': '([FY](?=[^P]))|(W(?=[^MP]))',\n", " 'chymotrypsin low specificity': '([FLY](?=[^P]))|(W(?=[^MP]))|(M(?=[^PY]))|(H(?=[^DMPW]))',\n", " 'chymotrypsin': '([FLY](?=[^P]))|(W(?=[^MP]))|(M(?=[^PY]))|(H(?=[^DMPW]))',\n", " 'clostripain': 'R',\n", " 'cnbr': 'M',\n", " 'enterokinase': '(?<=[DE]{3})K',\n", " 'factor xa': '(?<=[AFGILTVM][DE]G)R',\n", " 'formic acid': 'D',\n", " 'glutamyl endopeptidase': 'E',\n", " 'glu-c': 'E',\n", " 'granzyme b': '(?<=IEP)D',\n", " 'hydroxylamine': 'N(?=G)',\n", " 'iodosobenzoic acid': 'W',\n", " 'lys-c': 'K',\n", " 'lys-n': '\\w(?=K)',\n", " 'ntcb': '\\w(?=C)',\n", " 'pepsin ph1.3': '((?<=[^HKR][^P])[^R](?=[FL][^P]))|((?<=[^HKR][^P])[FL](?=\\w[^P]))',\n", " 'pepsin ph2.0': '((?<=[^HKR][^P])[^R](?=[FLWY][^P]))|((?<=[^HKR][^P])[FLWY](?=\\w[^P]))',\n", " 'proline endopeptidase': '(?<=[HKR])P(?=[^P])',\n", " 'proteinase k': '[AEFILTVWY]',\n", " 'staphylococcal peptidase i': '(?<=[^E])E',\n", " 'thermolysin': '[^DE](?=[AFILMV])',\n", " 'thrombin': '((?<=G)R(?=G))|((?<=[AFGILTVM][AFGILTVWA]P)R(?=[^DE][^DE]))',\n", " 'trypsin_full': '([KR](?=[^P]))|((?<=W)K(?=P))|((?<=M)R(?=P))',\n", " 'trypsin_exception': 
'((?<=[CD])K(?=D))|((?<=C)K(?=[HY]))|((?<=C)R(?=K))|((?<=R)R(?=[HR]))',\n", " 'trypsin': '([KR](?=[^P]))',\n", " 'trypsin/P': '([KR])',\n", " 'non-specific': '()',\n", " 'no-cleave': '_'}" ] }, "execution_count": null, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from alphabase.protein.fasta import protease_dict\n", "protease_dict" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Initialize a `PredictSpecLibFasta` object" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from peptdeep.protein.fasta import PredictSpecLibFasta\n", "from peptdeep.pretrained_models import ModelManager\n", "\n", "model_mgr = ModelManager(device='gpu')\n", "\n", "model_mgr.nce = nce\n", "model_mgr.instrument = instrument\n", "\n", "fasta_lib = PredictSpecLibFasta(\n", " model_mgr, \n", " protease=digest,\n", " charged_frag_types=frag_types, \n", " var_mods=var_mods, \n", " fix_mods=['Carbamidomethyl@C'],\n", " max_missed_cleavages=max_miss_cleave,\n", " max_var_mod_num=max_var_mods,\n", " peptide_length_max=max_pep_len,\n", " peptide_length_min=min_pep_len,\n", " precursor_charge_min=precursor_charge_min,\n", " precursor_charge_max=precursor_charge_max,\n", " precursor_mz_min=min_pep_mz,\n", " precursor_mz_max=max_pep_mz,\n", " decoy=None\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Digest" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fasta_lib.get_peptides_from_fasta_list(fasta_list)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If we have a sequence DataFrame (`seq_df`) containing peptide sequences in the `sequence` column, we can skip `get_peptides_from_fasta_list`. 
Just assign `seq_df` to `fasta_lib._precursor_df` and then run all of the following steps.\n", "\n", "```\n", "fasta_lib._precursor_df = seq_df\n", "```" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Append decoy sequences and add modifications" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fasta_lib.append_decoy_sequence()\n", "fasta_lib.add_modifications()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "After digestion, we get a protein DataFrame (`protein_df`):" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
| \n", " | protein_id | \n", "full_name | \n", "gene_name | \n", "description | \n", "sequence | \n", "
|---|---|---|---|---|---|
| 0 | \n", "Q9H9K5 | \n", "sp|Q9H9K5|MER34_HUMAN | \n", "ERVMER34-1 | \n", "sp|Q9H9K5|MER34_HUMAN Endogenous retroviral en... | \n", "MGSLSNYALLQLTLTAFLTILVQPQHLLAPVFRTLSILTNQSNCWL... | \n", "
| 1 | \n", "P04439 | \n", "sp|P04439|HLAA_HUMAN | \n", "HLA-A | \n", "sp|P04439|HLAA_HUMAN HLA class I histocompatib... | \n", "MAVMAPRTLLLLLSGALALTQTWAGSHSMRYFFTSVSRPGRGEPRF... | \n", "
| 2 | \n", "P01911 | \n", "sp|P01911|DRB1_HUMAN | \n", "HLA-DRB1 | \n", "sp|P01911|DRB1_HUMAN HLA class II histocompati... | \n", "MVCLKLPGGSCMTALTVTLMVLSSPLALSGDTRPRFLWQPKRECHF... | \n", "
| 3 | \n", "P01889 | \n", "sp|P01889|HLAB_HUMAN | \n", "HLA-B | \n", "sp|P01889|HLAB_HUMAN HLA class I histocompatib... | \n", "MLVMAPRTVLLLLSAALALTETWAGSHSMRYFYTSVSRPGRGEPRF... | \n", "
| 4 | \n", "P31689 | \n", "sp|P31689|DNJA1_HUMAN | \n", "DNAJA1 | \n", "sp|P31689|DNJA1_HUMAN DnaJ homolog subfamily A... | \n", "MVKETTYYDVLGVKPNATQEELKKAYRKLALKYHPDKNPNEGEKFK... | \n", "
| ... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "
| 20391 | \n", "Q8WVZ7 | \n", "sp|Q8WVZ7|RN133_HUMAN | \n", "RNF133 | \n", "sp|Q8WVZ7|RN133_HUMAN E3 ubiquitin-protein lig... | \n", "MHLLKVGTWRNNTASSWLMKFSVLWLVSQNCCRASVVWMAYMNISF... | \n", "
| 20392 | \n", "P05387 | \n", "sp|P05387|RLA2_HUMAN | \n", "RPLP2 | \n", "sp|P05387|RLA2_HUMAN 60S acidic ribosomal prot... | \n", "MRYVASYLLAALGGNSSPSAKDIKKILDSVGIEADDDRLNKVISEL... | \n", "
| 20393 | \n", "P51991 | \n", "sp|P51991|ROA3_HUMAN | \n", "HNRNPA3 | \n", "sp|P51991|ROA3_HUMAN Heterogeneous nuclear rib... | \n", "MEVKPPPGRPQPDSGRRRRRRGEEGHDPKEPEQLRKLFIGGLSFET... | \n", "
| 20394 | \n", "Q9BZX4 | \n", "sp|Q9BZX4|ROP1B_HUMAN | \n", "ROPN1B | \n", "sp|Q9BZX4|ROP1B_HUMAN Ropporin-1B OS=Homo sapi... | \n", "MAQTDKPTCIPPELPKMLKEFAKAAIRAQPQDLIQWGADYFEALSR... | \n", "
| 20395 | \n", "P34096 | \n", "sp|P34096|RNAS4_HUMAN | \n", "RNASE4 | \n", "sp|P34096|RNAS4_HUMAN Ribonuclease 4 OS=Homo s... | \n", "MALQRTHSLLLLLLLTLLGLGLVQPSYGQDGMYQRFLRQHVHPEET... | \n", "
20396 rows × 5 columns
\n", "| \n", " | sequence | \n", "protein_idxes | \n", "miss_cleavage | \n", "is_prot_nterm | \n", "is_prot_cterm | \n", "mods | \n", "mod_sites | \n", "nAA | \n", "charge | \n", "
|---|---|---|---|---|---|---|---|---|---|
| 0 | \n", "RIHTGQR | \n", "19786 | \n", "1 | \n", "False | \n", "False | \n", "\n", " | \n", " | 7 | \n", "2 | \n", "
| 1 | \n", "RIHTGQR | \n", "19786 | \n", "1 | \n", "False | \n", "False | \n", "\n", " | \n", " | 7 | \n", "3 | \n", "
| 2 | \n", "RIHTGQR | \n", "19786 | \n", "1 | \n", "False | \n", "False | \n", "\n", " | \n", " | 7 | \n", "4 | \n", "
| 3 | \n", "LVDSAYK | \n", "12819 | \n", "0 | \n", "False | \n", "False | \n", "\n", " | \n", " | 7 | \n", "2 | \n", "
| 4 | \n", "LVDSAYK | \n", "12819 | \n", "0 | \n", "False | \n", "False | \n", "\n", " | \n", " | 7 | \n", "3 | \n", "
| ... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "
| 5617819 | \n", "KNQAADDDDEDLNDTNYDEFNGYAGSLFSSGPYEK | \n", "2299 | \n", "1 | \n", "False | \n", "False | \n", "\n", " | \n", " | 35 | \n", "3 | \n", "
| 5617820 | \n", "KNQAADDDDEDLNDTNYDEFNGYAGSLFSSGPYEK | \n", "2299 | \n", "1 | \n", "False | \n", "False | \n", "\n", " | \n", " | 35 | \n", "4 | \n", "
| 5617821 | \n", "AYDADSGFNGKVLFTISDGNTDSCFNIDMETGQLK | \n", "10080 | \n", "1 | \n", "False | \n", "False | \n", "Carbamidomethyl@C | \n", "24 | \n", "35 | \n", "2 | \n", "
| 5617822 | \n", "AYDADSGFNGKVLFTISDGNTDSCFNIDMETGQLK | \n", "10080 | \n", "1 | \n", "False | \n", "False | \n", "Carbamidomethyl@C | \n", "24 | \n", "35 | \n", "3 | \n", "
| 5617823 | \n", "AYDADSGFNGKVLFTISDGNTDSCFNIDMETGQLK | \n", "10080 | \n", "1 | \n", "False | \n", "False | \n", "Carbamidomethyl@C | \n", "24 | \n", "35 | \n", "4 | \n", "
5617824 rows × 9 columns
\n", "| \n", " | sequence | \n", "protein_idxes | \n", "miss_cleavage | \n", "is_prot_nterm | \n", "is_prot_cterm | \n", "mods | \n", "mod_sites | \n", "nAA | \n", "charge | \n", "mod_seq_hash | \n", "mod_seq_charge_hash | \n", "precursor_mz | \n", "
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | \n", "RIHTGQR | \n", "19786 | \n", "1 | \n", "False | \n", "False | \n", "\n", " | \n", " | 7 | \n", "2 | \n", "471662500970219628 | \n", "471662500970219630 | \n", "434.249018 | \n", "
| 1 | \n", "PMPMPVR | \n", "9448 | \n", "0 | \n", "False | \n", "False | \n", "\n", " | \n", " | 7 | \n", "2 | \n", "-5301076820607700090 | \n", "-5301076820607700088 | \n", "414.216952 | \n", "
| 2 | \n", "PMPMPVR | \n", "9448 | \n", "0 | \n", "False | \n", "False | \n", "Oxidation@M | \n", "4 | \n", "7 | \n", "2 | \n", "6057464136741449831 | \n", "6057464136741449833 | \n", "422.214409 | \n", "
| 3 | \n", "PMPMPVR | \n", "9448 | \n", "0 | \n", "False | \n", "False | \n", "Oxidation@M | \n", "2 | \n", "7 | \n", "2 | \n", "-6431722582867031756 | \n", "-6431722582867031754 | \n", "422.214409 | \n", "
| 4 | \n", "QEWFCTR | \n", "12819 | \n", "0 | \n", "False | \n", "False | \n", "Carbamidomethyl@C | \n", "5 | \n", "7 | \n", "2 | \n", "-7409729050206298801 | \n", "-7409729050206298799 | \n", "513.726727 | \n", "
| ... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "
| 3654202 | \n", "NLTYVRGSVGPATSTLMFVAGVVGNGLALGILSAR | \n", "978 | \n", "1 | \n", "False | \n", "False | \n", "\n", " | \n", " | 35 | \n", "4 | \n", "7192344052213098704 | \n", "7192344052213098708 | \n", "866.228888 | \n", "
| 3654203 | \n", "NLTYVRGSVGPATSTLMFVAGVVGNGLALGILSAR | \n", "978 | \n", "1 | \n", "False | \n", "False | \n", "Oxidation@M | \n", "17 | \n", "35 | \n", "3 | \n", "-1485306056792248111 | \n", "-1485306056792248108 | \n", "1159.967730 | \n", "
| 3654204 | \n", "NLTYVRGSVGPATSTLMFVAGVVGNGLALGILSAR | \n", "978 | \n", "1 | \n", "False | \n", "False | \n", "Oxidation@M | \n", "17 | \n", "35 | \n", "4 | \n", "-1485306056792248111 | \n", "-1485306056792248107 | \n", "870.227616 | \n", "
| 3654205 | \n", "KNQAADDDDEDLNDTNYDEFNGYAGSLFSSGPYEK | \n", "2299 | \n", "1 | \n", "False | \n", "False | \n", "\n", " | \n", " | 35 | \n", "4 | \n", "5191231126132273751 | \n", "5191231126132273755 | \n", "976.910866 | \n", "
| 3654206 | \n", "AYDADSGFNGKVLFTISDGNTDSCFNIDMETGQLK | \n", "10080 | \n", "1 | \n", "False | \n", "False | \n", "Carbamidomethyl@C | \n", "24 | \n", "35 | \n", "4 | \n", "-7707559913944666938 | \n", "-7707559913944666934 | \n", "958.434460 | \n", "
3654207 rows × 12 columns
\n", "| \n", " | b_z1 | \n", "b_z2 | \n", "y_z1 | \n", "y_z2 | \n", "
|---|---|---|---|---|
| 0 | \n", "0.000000 | \n", "0.0 | \n", "0.611678 | \n", "0.0 | \n", "
| 1 | \n", "0.056326 | \n", "0.0 | \n", "1.000000 | \n", "0.0 | \n", "
| 2 | \n", "0.437313 | \n", "0.0 | \n", "0.729849 | \n", "0.0 | \n", "
| 3 | \n", "0.219575 | \n", "0.0 | \n", "0.292181 | \n", "0.0 | \n", "
| 4 | \n", "0.346306 | \n", "0.0 | \n", "0.033992 | \n", "0.0 | \n", "
| ... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "
| 60404997 | \n", "0.000000 | \n", "0.0 | \n", "0.322072 | \n", "0.0 | \n", "
| 60404998 | \n", "0.000000 | \n", "0.0 | \n", "0.206371 | \n", "0.0 | \n", "
| 60404999 | \n", "0.000000 | \n", "0.0 | \n", "0.033532 | \n", "0.0 | \n", "
| 60405000 | \n", "0.000000 | \n", "0.0 | \n", "0.040032 | \n", "0.0 | \n", "
| 60405001 | \n", "0.000000 | \n", "0.0 | \n", "0.000000 | \n", "0.0 | \n", "
60405002 rows × 4 columns
\n", "| \n", " | b_z1 | \n", "b_z2 | \n", "y_z1 | \n", "y_z2 | \n", "
|---|---|---|---|---|
| 0 | \n", "157.108387 | \n", "79.057832 | \n", "711.389648 | \n", "356.198462 | \n", "
| 1 | \n", "270.192451 | \n", "135.599864 | \n", "598.305584 | \n", "299.656430 | \n", "
| 2 | \n", "407.251363 | \n", "204.129320 | \n", "461.246672 | \n", "231.126974 | \n", "
| 3 | \n", "508.299042 | \n", "254.653159 | \n", "360.198993 | \n", "180.603135 | \n", "
| 4 | \n", "565.320506 | \n", "283.163891 | \n", "303.177530 | \n", "152.092403 | \n", "
| ... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "
| 60404997 | \n", "3285.398701 | \n", "1643.202989 | \n", "546.324588 | \n", "273.665932 | \n", "
| 60404998 | \n", "3386.446379 | \n", "1693.726828 | \n", "445.276909 | \n", "223.142093 | \n", "
| 60404999 | \n", "3443.467843 | \n", "1722.237560 | \n", "388.255446 | \n", "194.631361 | \n", "
| 60405000 | \n", "3571.526420 | \n", "1786.266848 | \n", "260.196868 | \n", "130.602072 | \n", "
| 60405001 | \n", "3684.610484 | \n", "1842.808880 | \n", "147.112804 | \n", "74.060040 | \n", "
60405002 rows × 4 columns
\n", "| \n", " | sequence | \n", "protein_idxes | \n", "miss_cleavage | \n", "is_prot_nterm | \n", "is_prot_cterm | \n", "mods | \n", "mod_sites | \n", "nAA | \n", "charge | \n", "mod_seq_hash | \n", "... | \n", "precursor_mz | \n", "instrument | \n", "nce | \n", "rt_pred | \n", "rt_norm_pred | \n", "ccs_pred | \n", "mobility_pred | \n", "frag_stop_idx | \n", "frag_start_idx | \n", "irt_pred | \n", "
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | \n", "RIHTGQR | \n", "19786 | \n", "1 | \n", "False | \n", "False | \n", "\n", " | \n", " | 7 | \n", "2 | \n", "471662500970219628 | \n", "... | \n", "434.249018 | \n", "timsTOF | \n", "30 | \n", "0.115377 | \n", "0.115377 | \n", "315.529022 | \n", "0.775438 | \n", "6 | \n", "0 | \n", "-37.187631 | \n", "
| 1 | \n", "PMPMPVR | \n", "9448 | \n", "0 | \n", "False | \n", "False | \n", "\n", " | \n", " | 7 | \n", "2 | \n", "-5301076820607700090 | \n", "... | \n", "414.216952 | \n", "timsTOF | \n", "30 | \n", "0.208976 | \n", "0.208976 | \n", "304.965790 | \n", "0.748912 | \n", "12 | \n", "6 | \n", "-16.331142 | \n", "
| 2 | \n", "PMPMPVR | \n", "9448 | \n", "0 | \n", "False | \n", "False | \n", "Oxidation@M | \n", "4 | \n", "7 | \n", "2 | \n", "6057464136741449831 | \n", "... | \n", "422.214409 | \n", "timsTOF | \n", "30 | \n", "0.158058 | \n", "0.158058 | \n", "304.080536 | \n", "0.746970 | \n", "18 | \n", "12 | \n", "-27.677099 | \n", "
| 3 | \n", "PMPMPVR | \n", "9448 | \n", "0 | \n", "False | \n", "False | \n", "Oxidation@M | \n", "2 | \n", "7 | \n", "2 | \n", "-6431722582867031756 | \n", "... | \n", "422.214409 | \n", "timsTOF | \n", "30 | \n", "0.157143 | \n", "0.157143 | \n", "305.825348 | \n", "0.751256 | \n", "24 | \n", "18 | \n", "-27.881022 | \n", "
| 4 | \n", "QEWFCTR | \n", "12819 | \n", "0 | \n", "False | \n", "False | \n", "Carbamidomethyl@C | \n", "5 | \n", "7 | \n", "2 | \n", "-7409729050206298801 | \n", "... | \n", "513.726727 | \n", "timsTOF | \n", "30 | \n", "0.423747 | \n", "0.423747 | \n", "330.547638 | \n", "0.814317 | \n", "30 | \n", "24 | \n", "31.526291 | \n", "
| ... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "
| 3654202 | \n", "NLTYVRGSVGPATSTLMFVAGVVGNGLALGILSAR | \n", "978 | \n", "1 | \n", "False | \n", "False | \n", "\n", " | \n", " | 35 | \n", "4 | \n", "7192344052213098704 | \n", "... | \n", "866.228888 | \n", "timsTOF | \n", "30 | \n", "0.831350 | \n", "0.831350 | \n", "891.748413 | \n", "1.108824 | \n", "60404866 | \n", "60404832 | \n", "122.352159 | \n", "
| 3654203 | \n", "NLTYVRGSVGPATSTLMFVAGVVGNGLALGILSAR | \n", "978 | \n", "1 | \n", "False | \n", "False | \n", "Oxidation@M | \n", "17 | \n", "35 | \n", "3 | \n", "-1485306056792248111 | \n", "... | \n", "1159.967730 | \n", "timsTOF | \n", "30 | \n", "0.826977 | \n", "0.826977 | \n", "785.478699 | \n", "1.302269 | \n", "60404900 | \n", "60404866 | \n", "121.377815 | \n", "
| 3654204 | \n", "NLTYVRGSVGPATSTLMFVAGVVGNGLALGILSAR | \n", "978 | \n", "1 | \n", "False | \n", "False | \n", "Oxidation@M | \n", "17 | \n", "35 | \n", "4 | \n", "-1485306056792248111 | \n", "... | \n", "870.227616 | \n", "timsTOF | \n", "30 | \n", "0.826977 | \n", "0.826977 | \n", "892.459656 | \n", "1.109729 | \n", "60404934 | \n", "60404900 | \n", "121.377815 | \n", "
| 3654205 | \n", "KNQAADDDDEDLNDTNYDEFNGYAGSLFSSGPYEK | \n", "2299 | \n", "1 | \n", "False | \n", "False | \n", "\n", " | \n", " | 35 | \n", "4 | \n", "5191231126132273751 | \n", "... | \n", "976.910866 | \n", "timsTOF | \n", "30 | \n", "0.670129 | \n", "0.670129 | \n", "791.322266 | \n", "0.984398 | \n", "60404968 | \n", "60404934 | \n", "86.427514 | \n", "
| 3654206 | \n", "AYDADSGFNGKVLFTISDGNTDSCFNIDMETGQLK | \n", "10080 | \n", "1 | \n", "False | \n", "False | \n", "Carbamidomethyl@C | \n", "24 | \n", "35 | \n", "4 | \n", "-7707559913944666938 | \n", "... | \n", "958.434460 | \n", "timsTOF | \n", "30 | \n", "0.725150 | \n", "0.725150 | \n", "823.819214 | \n", "1.024754 | \n", "60405002 | \n", "60404968 | \n", "98.687774 | \n", "
3654207 rows × 21 columns
\n", "