LIVIVO - The Search Portal for Life Sciences

Search results

Results 1-10 of 34

  1. Article ; Online: Deep kernel learning improves molecular fingerprint prediction from tandem mass spectra.

    Dührkop, Kai

    Bioinformatics (Oxford, England)

    2022  Volume 38, Issue Suppl 1, Page(s) i342–i349

    Abstract Motivation: Untargeted metabolomics experiments rely on spectral libraries for structure annotation, but these libraries are vastly incomplete; in silico methods search in structure databases, allowing us to overcome this limitation. The best-performing in silico methods use machine learning to predict a molecular fingerprint from tandem mass spectra, then use the predicted fingerprint to search in a molecular structure database. Predicted molecular fingerprints are also of great interest for compound class annotation, de novo structure elucidation, and other tasks. So far, kernel support vector machines are the best tool for fingerprint prediction. However, they cannot be trained on all publicly available reference spectra because their training time scales cubically with the number of training data.
    Results: We use the Nyström approximation to transform the kernel into a linear feature map. We evaluate two methods that use this feature map as input: a linear support vector machine and a deep neural network (DNN). For evaluation, we use a cross-validated dataset of 156 017 compounds and three independent datasets with 1734 compounds. We show that the combination of kernel method and DNN outperforms the kernel support vector machine, which is the current gold standard, as well as a DNN on tandem mass spectra on all evaluation datasets.
    Availability and implementation: The deep kernel learning method for fingerprint prediction is part of the SIRIUS software, available at https://bio.informatik.uni-jena.de/software/sirius.
    MeSH term(s) Databases, Chemical ; Machine Learning ; Metabolomics/methods ; Neural Networks, Computer ; Tandem Mass Spectrometry/methods
    Language English
    Publishing date 2022-06-18
    Publishing country England
    Document type Journal Article ; Research Support, Non-U.S. Gov't
    ZDB-ID 1422668-6
    ISSN 1367-4811 ; 1367-4803
    ISSN (online) 1367-4811
    ISSN 1367-4803
    DOI 10.1093/bioinformatics/btac260
    Database MEDical Literature Analysis and Retrieval System OnLINE
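The Results paragraph of this record describes turning a kernel into a linear feature map with the Nyström approximation. The sketch below illustrates that idea only; it is not the SIRIUS implementation. The RBF kernel, the three landmark points, and all function names are assumptions chosen for illustration, and a Cholesky factor is used as the matrix square root. The resulting features can be fed to any linear model or neural network.

```python
import math

def rbf_kernel(x, y, gamma=0.5):
    """Gaussian (RBF) kernel; an assumed stand-in, the abstract does not name the kernel."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

def cholesky(matrix):
    """Return lower-triangular L with L * L^T == matrix (matrix must be positive definite)."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - s)
            else:
                lower[i][j] = (matrix[i][j] - s) / lower[j][j]
    return lower

def nystroem_features(x, landmarks, lower, kernel=rbf_kernel):
    """Map x to phi(x) = L^{-1} k(x), where k(x) holds kernel values against the landmarks."""
    k_vec = [kernel(x, z) for z in landmarks]
    phi = []
    for i, row in enumerate(lower):  # forward substitution solves L * phi = k_vec
        phi.append((k_vec[i] - sum(row[j] * phi[j] for j in range(i))) / row[i])
    return phi

def dot(u, v):
    """Plain inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

# Hypothetical landmark points; in practice these would be a sample of the training data.
landmarks = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]
gram = [[rbf_kernel(a, b) for b in landmarks] for a in landmarks]
lower = cholesky(gram)
```

With this map, `dot(nystroem_features(x, ...), nystroem_features(y, ...))` equals `k(x)^T K^{-1} k(y)`, which reproduces the kernel exactly on the landmarks and approximates it elsewhere. The feature dimension equals the number of landmarks, so training cost no longer grows cubically with the full dataset size.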


  2. Article ; Online: Combining Experimental with Computational Infrared and Mass Spectra for High-Throughput Nontargeted Chemical Structure Identification

    Karunaratne, Erandika / Hill, Dennis W. / Dührkop, Kai / Böcker, Sebastian / Grant, David F.

    Analytical Chemistry. 2023 Aug. 04, v. 95, no. 32, p. 11901-11907

    2023  

    Abstract The inability to identify the structures of most metabolites detected in environmental or biological samples limits the utility of nontargeted metabolomics. The most widely used analytical approaches combine mass spectrometry and machine learning methods to rank candidate structures contained in large chemical databases. Given the large chemical space typically searched, the use of additional orthogonal data may improve the identification rates and reliability. Here, we present results of combining experimental and computational mass and IR spectral data for high-throughput nontargeted chemical structure identification. Experimental MS/MS and gas-phase IR data for 148 test compounds were obtained from NIST. Candidate structures for each of the test compounds were obtained from PubChem (mean = 4444 candidate structures per test compound). Our workflow used CSI:FingerID to initially score and rank the candidate structures. The top 1000 ranked candidates were subsequently used for IR spectra prediction, scoring, and ranking using density functional theory (DFT-IR). Final ranking of the candidates was based on a composite score calculated as the average of the CSI:FingerID and DFT-IR rankings. This approach resulted in the correct identification of 88 of the 148 test compounds (59%). 129 of the 148 test compounds (87%) were ranked within the top 20 candidates. These identification rates are the highest yet reported when candidate structures are used from PubChem. Combining experimental and computational MS/MS and IR spectral data is a potentially powerful option for prioritizing candidates for final structure verification.
    Keywords analytical chemistry ; chemical structure ; density functional theory ; mass spectrometry ; metabolites ; metabolomics ; prediction ; spectral analysis
    Language English
    Dates of publication 2023-08-04
    Size p. 11901-11907.
    Publishing place American Chemical Society
    Document type Article ; Online
    ZDB-ID 1508-8
    ISSN 1520-6882 ; 0003-2700
    ISSN (online) 1520-6882
    ISSN 0003-2700
    DOI 10.1021/acs.analchem.3c00937
    Database NAL-Catalogue (AGRICOLA)
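The abstract of this record states that the final ranking is based on a composite score calculated as the average of the CSI:FingerID and DFT-IR rankings. A minimal sketch of such rank averaging follows. The function names, candidate labels, scores, and the alphabetical tie-break are hypothetical and are not taken from the paper.

```python
def ranks(scores):
    """Rank candidates by descending score; the best candidate gets rank 1."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {name: position for position, name in enumerate(ordered, start=1)}

def composite_ranking(ms_scores, ir_scores):
    """Order candidates by the mean of their two ranks (ties broken by name, an assumption)."""
    ms_rank, ir_rank = ranks(ms_scores), ranks(ir_scores)
    composite = {name: (ms_rank[name] + ir_rank[name]) / 2 for name in ms_scores}
    return sorted(composite, key=lambda name: (composite[name], name))

# Hypothetical scores for four candidate structures (higher is better).
ms_scores = {"A": 0.9, "B": 0.8, "C": 0.5, "D": 0.1}
ir_scores = {"A": 0.3, "B": 0.9, "C": 0.8, "D": 0.1}
final_order = composite_ranking(ms_scores, ir_scores)
```

Here candidate B has ranks 2 and 1 (mean 1.5) and overtakes A, which has ranks 1 and 3 (mean 2). Averaging ranks rather than raw scores avoids having to put the two scoring scales on a common footing.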


  3. Article ; Online: MSNovelist: de novo structure generation from mass spectra.

    Stravs, Michael A / Dührkop, Kai / Böcker, Sebastian / Zamboni, Nicola

    Nature methods

    2022  Volume 19, Issue 7, Page(s) 865–870

    Abstract Current methods for structure elucidation of small molecules rely on finding similarity with spectra of known compounds, but do not predict structures de novo for unknown compound classes. We present MSNovelist, which combines fingerprint prediction with an encoder-decoder neural network to generate structures de novo solely from tandem mass spectrometry (MS
    MeSH term(s) Databases, Factual ; Tandem Mass Spectrometry
    Language English
    Publishing date 2022-05-30
    Publishing country United States
    Document type Journal Article ; Research Support, Non-U.S. Gov't
    ZDB-ID 2169522-2
    ISSN 1548-7105 ; 1548-7091
    ISSN (online) 1548-7105
    ISSN 1548-7091
    DOI 10.1038/s41592-022-01486-3
    Database MEDical Literature Analysis and Retrieval System OnLINE


  4. Article ; Online: Combining Experimental with Computational Infrared and Mass Spectra for High-Throughput Nontargeted Chemical Structure Identification.

    Karunaratne, Erandika / Hill, Dennis W / Dührkop, Kai / Böcker, Sebastian / Grant, David F

    Analytical chemistry

    2023  Volume 95, Issue 32, Page(s) 11901–11907

    Abstract The inability to identify the structures of most metabolites detected in environmental or biological samples limits the utility of nontargeted metabolomics. The most widely used analytical approaches combine mass spectrometry and machine learning methods to rank candidate structures contained in large chemical databases. Given the large chemical space typically searched, the use of additional orthogonal data may improve the identification rates and reliability. Here, we present results of combining experimental and computational mass and IR spectral data for high-throughput nontargeted chemical structure identification. Experimental MS/MS and gas-phase IR data for 148 test compounds were obtained from NIST. Candidate structures for each of the test compounds were obtained from PubChem (mean = 4444 candidate structures per test compound). Our workflow used CSI:FingerID to initially score and rank the candidate structures. The top 1000 ranked candidates were subsequently used for IR spectra prediction, scoring, and ranking using density functional theory (DFT-IR). Final ranking of the candidates was based on a composite score calculated as the average of the CSI:FingerID and DFT-IR rankings. This approach resulted in the correct identification of 88 of the 148 test compounds (59%). 129 of the 148 test compounds (87%) were ranked within the top 20 candidates. These identification rates are the highest yet reported when candidate structures are used from PubChem. Combining experimental and computational MS/MS and IR spectral data is a potentially powerful option for prioritizing candidates for final structure verification.
    MeSH term(s) Tandem Mass Spectrometry ; Reproducibility of Results ; Databases, Chemical ; Metabolomics/methods ; Machine Learning
    Language English
    Publishing date 2023-08-04
    Publishing country United States
    Document type Journal Article ; Research Support, N.I.H., Extramural ; Research Support, Non-U.S. Gov't
    ZDB-ID 1508-8
    ISSN 1520-6882 ; 0003-2700
    ISSN (online) 1520-6882
    ISSN 0003-2700
    DOI 10.1021/acs.analchem.3c00937
    Database MEDical Literature Analysis and Retrieval System OnLINE


  5. Article: Mass Difference Matching Unfolds Hidden Molecular Structures of Dissolved Organic Matter

    Simon, Carsten / Dührkop, Kai / Petras, Daniel / Roth, Vanessa-Nina / Böcker, Sebastian / Dorrestein, Pieter C. / Gleixner, Gerd

    Environmental science & technology. 2022 July 14, v. 56, no. 15

    2022  

    Abstract Ultrahigh-resolution Fourier transform mass spectrometry (FTMS) has revealed unprecedented details of natural complex mixtures such as dissolved organic matter (DOM) on a molecular formula level, but we lack approaches to access the underlying structural complexity. We here explore the hypothesis that every DOM precursor ion is potentially linked with all emerging product ions in FTMS² experiments. The resulting mass difference (Δm) matrix is deconvoluted to isolate individual precursor ion Δm profiles and matched with structural information, which was derived from 42 Δm features from 14 in-house reference compounds and a global set of 11 477 Δm features with assigned structure specificities, using a dataset of ∼18 000 unique structures. We show that Δm matching is highly sensitive in predicting potential precursor ion identities in terms of molecular and structural composition. Additionally, the approach identified unresolved precursor ions and missing elements in molecular formula annotation (P, Cl, F). Our study provides first results on how Δm matching refines structural annotations in van Krevelen space but simultaneously demonstrates the wide overlap between potential structural classes. We show that this effect is likely driven by chemodiversity and offers an explanation for the observed ubiquitous presence of molecules in the center of the van Krevelen space. Our promising first results suggest that Δm matching can both unfold the structural information encrypted in DOM and assess the quality of FTMS-derived molecular formulas of complex mixtures in general.
    Keywords data collection ; dissolved organic matter ; environmental science ; mass spectrometry ; technology
    Language English
    Dates of publication 2022-07-14
    Size p. 11027-11040.
    Publishing place American Chemical Society
    Document type Article
    ISSN 1520-5851
    DOI 10.1021/acs.est.2c01332
    Database NAL-Catalogue (AGRICOLA)
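The abstract of this record describes linking every precursor ion with all product ions to form a mass difference (Δm) matrix and matching the differences against known Δm features. The sketch below shows only the pairing and matching step under simplifying assumptions; the deconvolution described in the abstract is not reproduced. The m/z values, the tolerance, and the function names are hypothetical. The three reference values are the monoisotopic masses of H2O, CO, and CO2.

```python
# Monoisotopic masses (Da) of three common neutral losses, used as example Δm features.
KNOWN_DELTAS = {"H2O": 18.010565, "CO": 27.994915, "CO2": 43.989829}

def delta_matrix(precursors, products):
    """Row i, column j holds precursor i minus product j."""
    return [[p - f for f in products] for p in precursors]

def match_deltas(precursors, products, tolerance=0.002):
    """Return (precursor index, product index, label) for every Δm within tolerance."""
    matches = []
    for i, row in enumerate(delta_matrix(precursors, products)):
        for j, delta in enumerate(row):
            for label, mass in KNOWN_DELTAS.items():
                if abs(delta - mass) <= tolerance:
                    matches.append((i, j, label))
    return matches

# Hypothetical m/z values, chosen so that the first precursor shows two known losses.
precursors = [301.0718, 315.0874]
products = [257.0820, 283.0612]
matches = match_deltas(precursors, products)
```

In this toy input the first precursor matches a CO2 loss against the first product and an H2O loss against the second, while the second precursor matches nothing in the small reference table.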


  6. Article: Fragmentation trees reloaded.

    Böcker, Sebastian / Dührkop, Kai

    Journal of cheminformatics

    2016  Volume 8, Page(s) 5

    Abstract Background: Untargeted metabolomics commonly uses liquid chromatography mass spectrometry to measure abundances of metabolites; subsequent tandem mass spectrometry is used to derive information about individual compounds. One of the bottlenecks in this experimental setup is the interpretation of fragmentation spectra to accurately and efficiently identify compounds. Fragmentation trees have become a powerful tool for the interpretation of tandem mass spectrometry data of small molecules. These trees are determined from the data using combinatorial optimization, and aim at explaining the experimental data via fragmentation cascades. Fragmentation tree computation does not require spectral or structural databases. To obtain biochemically meaningful trees, one needs an elaborate optimization function (scoring).
    Results: We present a new scoring for computing fragmentation trees, transforming the combinatorial optimization into a Maximum A Posteriori estimator. We demonstrate the superiority of the new scoring for two tasks: both for the de novo identification of molecular formulas of unknown compounds and for searching a database for structurally similar compounds, our method, SIRIUS 3, performs significantly better than the previous version of our method, as well as other methods for this task.
    Conclusion: SIRIUS 3 can be a part of an untargeted metabolomics workflow, allowing researchers to investigate unknowns using automated computational methods.
    Graphical abstract: We present a new scoring for computing fragmentation trees from tandem mass spectrometry data based on Bayesian statistics. The best scoring fragmentation tree most likely explains the molecular formula of the measured parent ion.
    Language English
    Publishing date 2016-02-01
    Publishing country England
    Document type Journal Article
    ZDB-ID 2486539-4
    ISSN 1758-2946
    DOI 10.1186/s13321-016-0116-8
    Database MEDical Literature Analysis and Retrieval System OnLINE


  7. Article: Bayesian networks for mass spectrometric metabolite identification via molecular fingerprints

    Ludwig, Marcus / Dührkop, Kai / Böcker, Sebastian

    Bioinformatics. 2018 July 01, v. 34, no. 13

    2018  

    Abstract Metabolites, small molecules that are involved in cellular reactions, provide a direct functional signature of cellular state. Untargeted metabolomics experiments usually rely on tandem mass spectrometry to identify the thousands of compounds in a biological sample. Recently, we presented CSI:FingerID for searching in molecular structure databases using tandem mass spectrometry data. CSI:FingerID predicts a molecular fingerprint that encodes the structure of the query compound, then uses this to search a molecular structure database such as PubChem. Scoring of the predicted query fingerprint and deterministic target fingerprints is carried out assuming independence between the molecular properties constituting the fingerprint. We present a scoring that takes into account dependencies between molecular properties. As before, we predict posterior probabilities of molecular properties using machine learning. Dependencies between molecular properties are modeled as a Bayesian tree network; the tree structure is estimated on the fly from the instance data. For each edge, we also estimate the expected covariance between the two random variables. For fixed marginal probabilities, we then estimate conditional probabilities using the known covariance. Now, the corrected posterior probability of each candidate can be computed, and candidates are ranked by this score. Modeling dependencies improves identification rates of CSI:FingerID by 2.85 percentage points. The new scoring Bayesian (fixed tree) is integrated into SIRIUS 4.0 (https://bio.informatik.uni-jena.de/software/sirius/).
    Keywords Bayesian theory ; artificial intelligence ; bioinformatics ; chemical structure ; covariance ; databases ; metabolites ; metabolomics ; models ; probability ; tandem mass spectrometry
    Language English
    Dates of publication 2018-07-01
    Size p. i333-i340.
    Publishing place Oxford University Press
    Document type Article
    ZDB-ID 1468345-3
    ISSN 1460-2059 ; 1367-4811 ; 1367-4803
    ISSN (online) 1460-2059 ; 1367-4811
    ISSN 1367-4803
    DOI 10.1093/bioinformatics/bty245
    Database NAL-Catalogue (AGRICOLA)


  8. Article ; Online: Bayesian networks for mass spectrometric metabolite identification via molecular fingerprints.

    Ludwig, Marcus / Dührkop, Kai / Böcker, Sebastian

    Bioinformatics (Oxford, England)

    2018  Volume 34, Issue 13, Page(s) i333–i340

    Abstract Motivation: Metabolites, small molecules that are involved in cellular reactions, provide a direct functional signature of cellular state. Untargeted metabolomics experiments usually rely on tandem mass spectrometry to identify the thousands of compounds in a biological sample. Recently, we presented CSI:FingerID for searching in molecular structure databases using tandem mass spectrometry data. CSI:FingerID predicts a molecular fingerprint that encodes the structure of the query compound, then uses this to search a molecular structure database such as PubChem. Scoring of the predicted query fingerprint and deterministic target fingerprints is carried out assuming independence between the molecular properties constituting the fingerprint.
    Results: We present a scoring that takes into account dependencies between molecular properties. As before, we predict posterior probabilities of molecular properties using machine learning. Dependencies between molecular properties are modeled as a Bayesian tree network; the tree structure is estimated on the fly from the instance data. For each edge, we also estimate the expected covariance between the two random variables. For fixed marginal probabilities, we then estimate conditional probabilities using the known covariance. Now, the corrected posterior probability of each candidate can be computed, and candidates are ranked by this score. Modeling dependencies improves identification rates of CSI:FingerID by 2.85 percentage points.
    Availability and implementation: The new scoring Bayesian (fixed tree) is integrated into SIRIUS 4.0 (https://bio.informatik.uni-jena.de/software/sirius/).
    MeSH term(s) Bayes Theorem ; Databases, Chemical ; Machine Learning ; Metabolomics/methods ; Software ; Tandem Mass Spectrometry
    Language English
    Publishing date 2018-06-14
    Publishing country England
    Document type Journal Article ; Research Support, Non-U.S. Gov't
    ZDB-ID 1422668-6
    ISSN 1367-4811 ; 1367-4803
    ISSN (online) 1367-4811
    ISSN 1367-4803
    DOI 10.1093/bioinformatics/bty245
    Database MEDical Literature Analysis and Retrieval System OnLINE


  9. Article ; Online: Mass Difference Matching Unfolds Hidden Molecular Structures of Dissolved Organic Matter.

    Simon, Carsten / Dührkop, Kai / Petras, Daniel / Roth, Vanessa-Nina / Böcker, Sebastian / Dorrestein, Pieter C / Gleixner, Gerd

    Environmental science & technology

    2022  Volume 56, Issue 15, Page(s) 11027–11040

    Abstract Ultrahigh-resolution Fourier transform mass spectrometry (FTMS) has revealed unprecedented details of natural complex mixtures such as dissolved organic matter (DOM) on a molecular formula level, but we lack approaches to access the underlying structural complexity. We here explore the hypothesis that every DOM precursor ion is potentially linked with all emerging product ions in FTMS
    MeSH term(s) Complex Mixtures ; Dissolved Organic Matter ; Molecular Structure ; Spectrometry, Mass, Electrospray Ionization/methods
    Chemical Substances Complex Mixtures ; Dissolved Organic Matter
    Language English
    Publishing date 2022-07-14
    Publishing country United States
    Document type Journal Article ; Research Support, Non-U.S. Gov't
    ISSN 1520-5851
    ISSN (online) 1520-5851
    DOI 10.1021/acs.est.2c01332
    Database MEDical Literature Analysis and Retrieval System OnLINE


  10. Article ; Online: De Novo Molecular Formula Annotation and Structure Elucidation Using SIRIUS 4.

    Ludwig, Marcus / Fleischauer, Markus / Dührkop, Kai / Hoffmann, Martin A / Böcker, Sebastian

    Methods in molecular biology (Clifton, N.J.)

    2020  Volume 2104, Page(s) 185–207

    Abstract SIRIUS 4 is the best-in-class computational tool for metabolite identification from high-resolution tandem mass spectrometry data. It offers de novo molecular formula annotation with outstanding accuracy. When searching fragmentation spectra in a structure database, it reaches over 70% correct identifications. A predicted fingerprint, which indicates the presence or absence of thousands of molecular properties, helps to deduce information about the compound of interest even if it is not contained in any structure database. Here, we present best practices and describe how to leverage the full potential of SIRIUS 4, how to incorporate it into your own workflow, and how it adds value to the analysis of mass spectrometry data beyond spectral library search.
    MeSH term(s) Chromatography, Liquid ; Computational Biology/methods ; Databases, Factual ; Humans ; Metabolomics/methods ; Molecular Structure ; Software ; Spectrometry, Mass, Electrospray Ionization ; Structure-Activity Relationship ; Tandem Mass Spectrometry ; User-Computer Interface ; Workflow
    Language English
    Publishing date 2020-01-17
    Publishing country United States
    Document type Journal Article ; Review
    ISSN 1940-6029
    ISSN (online) 1940-6029
    DOI 10.1007/978-1-0716-0239-3_11
    Database MEDical Literature Analysis and Retrieval System OnLINE
