MS Annika: A new Search Engine for Identifying Peptides from MS-cleavable Crosslink Data
G. J. Pirklbauer, C. Stieger, D. M. Borgmann, S. M. Winkler, K. Mechtler, V. Dorfer - MS Annika: A new Search Engine for Identifying Peptides from MS-cleavable Crosslink Data - Proceedings of the 2019 EuBIC Winter School on proteomics bioinformatics, Zakopane, Poland, 2019
Interest in crosslinking mass spectrometry has risen steadily over the last few years, as has the quality of the data and of the software tools used to analyse them. A great improvement came with the development of cross-linkers that are cleavable upon collision-induced dissociation. These linkers enable confident selection of spectra containing cross-linked peptides and provide additional information for identification.
Here, we present MS Annika, a novel algorithm for the identification of crosslink-spectrum matches (CSMs) from tandem mass spectrometry experiments. MS Annika is specialized for MS-cleavable linkers. It is designed to integrate into Proteome Discoverer (version 2.3), thus eliminating the need for pre-processing steps. The MS Annika algorithm is divided into three stages:
In the first step, MS Annika uses cross-link specific fragment ions, so-called crosslink reporter doublets, to select crosslink spectra. These reporter doublets correspond to the two cross-linked peptides, each of them observed once with the light and once with the heavy part of the cleaved linker. The algorithm also allows for the selection of spectra with incomplete doublets to increase the number of potential identifications. Based on these doublets, the theoretical precursor masses of the two peptides are determined.
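The doublet-based selection described above can be illustrated with a minimal Python sketch. It is not the MS Annika implementation: it assumes the DSSO linker with its commonly quoted monoisotopic remnant masses (alkene 54.01056 Da, thial 85.98264 Da, intact linker 158.00376 Da), assumes that fragment masses have already been deconvoluted to neutral masses, and uses hypothetical function names and a hypothetical absolute tolerance. The handling of incomplete doublets (inferring the partner peptide from the precursor mass) is likewise an assumption made for illustration.

```python
from itertools import combinations

# DSSO constants (monoisotopic, Da); assumed here for illustration only.
ALKENE = 54.01056                # light remnant left on a peptide
THIAL = 85.98264                 # heavy remnant left on a peptide
DSSO = 158.00376                 # mass added by the intact linker
DOUBLET_DELTA = THIAL - ALKENE   # spacing of a reporter doublet


def find_doublets(masses, tol=0.01):
    """Return (light, heavy) pairs of neutral masses spaced by DOUBLET_DELTA."""
    ordered = sorted(masses)
    return [(lo, hi) for lo, hi in combinations(ordered, 2)
            if abs((hi - lo) - DOUBLET_DELTA) <= tol]


def peptide_masses(masses, precursor_mass, tol=0.01):
    """Estimate the masses of the two linked peptides, or return None."""
    # Unmodified peptide mass = light doublet partner minus the alkene remnant.
    candidates = [light - ALKENE for light, _ in find_doublets(masses, tol)]

    # Complete case: two doublets that together explain the precursor mass.
    for a, b in combinations(candidates, 2):
        if abs(a + b + DSSO - precursor_mass) <= tol:
            return (a, b)

    # Incomplete case (assumption): infer the partner from the precursor mass.
    for a in candidates:
        b = precursor_mass - DSSO - a
        if b > 0:
            return (a, b)
    return None
```

For a spectrum with no doublet at all, `peptide_masses` returns `None`, which corresponds to the spectrum not being selected as a crosslink spectrum in this sketch.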
Secondly, a modified version of the MS Amanda database search engine algorithm provides multiple peptide sequences for both precursors. The highest scoring peptides for each precursor are combined to create CSMs. Subsequently, the CSMs are grouped into crosslinks by their cross-linked amino acid site.
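The combination and grouping step can be sketched as follows. The data layout is hypothetical: `PeptideHit`, `build_csm` and `group_by_site` are illustrative names, and the abstract does not specify how peptide scores are merged into a CSM score, so this sketch only pairs the top-scoring hit for each precursor and groups the resulting CSMs by an order-independent pair of (protein, residue position) sites.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class PeptideHit:
    """One candidate peptide returned by the database search (hypothetical)."""
    sequence: str
    protein: str
    start: int      # 1-based position of the peptide within the protein
    link_pos: int   # 1-based position of the linked residue in the peptide
    score: float

    @property
    def site(self):
        # Linked residue expressed in protein coordinates.
        return (self.protein, self.start + self.link_pos - 1)


def build_csm(spectrum_id, hits_a, hits_b):
    """Pair the highest scoring hit of each precursor into one CSM."""
    best_a = max(hits_a, key=lambda h: h.score)
    best_b = max(hits_b, key=lambda h: h.score)
    return (spectrum_id, best_a, best_b)


def group_by_site(csms):
    """Group CSMs into crosslinks keyed by their unordered pair of sites."""
    crosslinks = defaultdict(list)
    for csm in csms:
        _, a, b = csm
        key = tuple(sorted((a.site, b.site)))  # A-B and B-A share one key
        crosslinks[key].append(csm)
    return dict(crosslinks)
```

Sorting the two sites before using them as a key means that a CSM reporting peptide A first and another reporting peptide B first end up in the same crosslink.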
The third step comprises a target-decoy based validation. False discovery rates are calculated at CSM as well as crosslink level, resulting in robust identifications.
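A generic target-decoy cut-off can be sketched as below. This is the textbook estimate (number of decoys divided by number of targets above a score threshold), not necessarily the formula used by MS Annika; the function name `fdr_threshold` and the `(score, is_decoy)` input format are assumptions. The same function could be applied once to CSMs and once to crosslinks, for example by giving each crosslink the score of its best CSM, which is again an assumption rather than something stated in the abstract.

```python
def fdr_threshold(matches, max_fdr=0.05):
    """Return the lowest score cut-off whose estimated FDR is <= max_fdr.

    matches: iterable of (score, is_decoy) tuples.
    Returns None if no cut-off satisfies the requested FDR.
    """
    ranked = sorted(matches, key=lambda m: m[0], reverse=True)
    targets = decoys = 0
    cutoff = None
    for score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Estimated FDR among all matches scoring at least `score`.
        if targets and decoys / targets <= max_fdr:
            cutoff = score
    return cutoff


def accept(matches, max_fdr=0.05):
    """Keep target matches at or above the FDR-controlled cut-off."""
    cutoff = fdr_threshold(matches, max_fdr)
    if cutoff is None:
        return []
    return [m for m in matches if not m[1] and m[0] >= cutoff]
```

The loop keeps scanning after the FDR first exceeds the limit, so the returned cut-off is the most permissive one that still satisfies the requested rate.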
First results show that MS Annika is able to compete with other tools in the field, both in speed and in the number and sensitivity of identifications. For example, we ran both MeroX and MS Annika with default parameters on a sample with two proteins, allowing the DSSO linker to bind to lysine, serine, threonine and tyrosine as well as the protein N-terminus, and using carbamidomethylation of C as a static and oxidation of M as a variable modification. From 14708 spectra measured on a Thermo Fisher Q Exactive HF mass spectrometer, MeroX identified 234 CSMs, while MS Annika identified 282 CSMs, at an FDR cut-off of 5%.