Article ; Online: Deep kernel learning improves molecular fingerprint prediction from tandem mass spectra.
Bioinformatics (Oxford, England)
2022 Volume 38, Issue Suppl 1, Page(s) i342–i349
Abstract: Motivation: Untargeted metabolomics experiments rely on spectral libraries for structure annotation, but these libraries are vastly incomplete; in silico methods search in structure databases, allowing us to overcome this limitation. The best-performing ...
Abstract | Motivation: Untargeted metabolomics experiments rely on spectral libraries for structure annotation, but these libraries are vastly incomplete; in silico methods search in structure databases, allowing us to overcome this limitation. The best-performing in silico methods use machine learning to predict a molecular fingerprint from tandem mass spectra, then use the predicted fingerprint to search in a molecular structure database. Predicted molecular fingerprints are also of great interest for compound class annotation, de novo structure elucidation, and other tasks. So far, kernel support vector machines are the best tool for fingerprint prediction. However, they cannot be trained on all publicly available reference spectra because their training time scales cubically with the number of training data. Results: We use the Nyström approximation to transform the kernel into a linear feature map. We evaluate two methods that use this feature map as input: a linear support vector machine and a deep neural network (DNN). For evaluation, we use a cross-validated dataset of 156 017 compounds and three independent datasets with 1734 compounds. We show that the combination of kernel method and DNN outperforms the kernel support vector machine, which is the current gold standard, as well as a DNN on tandem mass spectra on all evaluation datasets. Availability and implementation: The deep kernel learning method for fingerprint prediction is part of the SIRIUS software, available at https://bio.informatik.uni-jena.de/software/sirius. |
---|---|
MeSH term(s) | Databases, Chemical ; Machine Learning ; Metabolomics/methods ; Neural Networks, Computer ; Tandem Mass Spectrometry/methods |
Language | English |
Publishing date | 2022-06-18 |
Publishing country | England |
Document type | Journal Article ; Research Support, Non-U.S. Gov't |
ZDB-ID | 1422668-6 |
ISSN | 1367-4811 ; 1367-4803 |
ISSN (online) | 1367-4811 |
ISSN | 1367-4803 |
DOI | 10.1093/bioinformatics/btac260 |
Database | MEDical Literature Analysis and Retrieval System OnLINE |
More links
Kategorien
In stock of ZB MED Cologne/Königswinter
Zs.A 2374: Show issues | Location: Je nach Verfügbarkeit (siehe Angabe bei Bestand) bis Jg. 1994: Bestellungen von Artikeln über das Online-Bestellformular Jg. 1995 - 2021: Lesesall (2.OG) ab Jg. 2022: Lesesaal (EG) |
Order via subito
This service is chargeable due to the Delivery terms set by subito. Orders including an article and supplementary material will be classified as separate orders. In these cases, fees will be demanded for each order.