LIVIVO - The Search Portal for Life Sciences


Search results

Results 1–10 of 240

  1. Article ; Online: panRGP: a pangenome-based method to predict genomic islands and explore their diversity.

    Bazin, Adelme / Gautreau, Guillaume / Médigue, Claudine / Vallenet, David / Calteau, Alexandra

    Bioinformatics (Oxford, England)

    2020  Volume 36, Issue Suppl_2, Page(s) i651–i658

    Abstract Motivation: Horizontal gene transfer (HGT) is a major source of variability in prokaryotic genomes. Regions of genome plasticity (RGPs) are clusters of genes located in highly variable genomic regions. Most of them arise from HGT and correspond to genomic islands (GIs). The study of those regions at the species level has become increasingly difficult with the data deluge of genomes. To date, no methods are available to identify GIs using hundreds of genomes to explore their diversity.
    Results: We present here the panRGP method that predicts RGPs using pangenome graphs made of all available genomes for a given species. It allows the study of thousands of genomes in order to access the diversity of RGPs and to predict spots of insertions. It gave the best predictions when benchmarked along other GI detection tools against a reference dataset. In addition, we illustrated its use on metagenome assembled genomes by redefining the borders of the leuX tRNA hotspot, a well-studied spot of insertion in Escherichia coli. panRGP is a scalable and reliable tool to predict GIs and spots, making it an ideal approach for large comparative studies.
    Availability and implementation: The methods presented in the current work are available through the following software: https://github.com/labgem/PPanGGOLiN. Detailed results and scripts to compute the benchmark metrics are available at https://github.com/axbazin/panrgp_supdata.
    MeSH term(s) Gene Transfer, Horizontal ; Genomic Islands/genetics ; Genomics ; Metagenome ; Software
    Language English
    Publishing date 2020-12-30
    Publishing country England
    Document type Journal Article ; Research Support, Non-U.S. Gov't
    ZDB-ID 1422668-6
    ISSN 1367-4803
    ISSN (online) 1367-4811
    DOI 10.1093/bioinformatics/btaa792
    Database MEDical Literature Analysis and Retrieval System OnLINE

  2. Article ; Online: GROOLS: reactive graph reasoning for genome annotation through biological processes.

    Mercier, Jonathan / Josso, Adrien / Médigue, Claudine / Vallenet, David

    BMC bioinformatics

    2018  Volume 19, Issue 1, Page(s) 132

    Abstract Background: High quality functional annotation is essential for understanding the phenotypic consequences encoded in a genome. Despite improvements in bioinformatics methods, millions of sequences in databanks are not assigned reliable functions. The curation of protein functions in the context of biological processes is a way to evaluate and improve their annotation.
    Results: We developed an expert system using paraconsistent logic, named GROOLS (Genomic Rule Object-Oriented Logic System), that evaluates the completeness and the consistency of predicted functions through biological processes like metabolic pathways. Using a generic and hierarchical representation of knowledge, biological processes are modeled in a graph from which observations (i.e. predictions and expectations) are propagated by rules. At the end of the reasoning, conclusions are assigned to biological process components and highlight uncertainties and inconsistencies. Results on 14 microbial organisms are presented.
    Conclusions: GROOLS software is designed to evaluate the overall accuracy of functional unit and pathway predictions according to organism experimental data like growth phenotypes. It assists biocurators in the functional annotation of proteins by focusing on missing or contradictory observations.
    MeSH term(s) Acinetobacter/genetics ; Algorithms ; Biological Phenomena ; Biosynthetic Pathways/genetics ; Computational Biology/methods ; Cysteine/biosynthesis ; Databases, Factual ; Genome ; Molecular Sequence Annotation ; Software
    Chemical Substances Cysteine (K848JZ4886)
    Language English
    Publishing date 2018-04-11
    Publishing country England
    Document type Journal Article ; Research Support, Non-U.S. Gov't
    ZDB-ID 2041484-5
    ISSN 1471-2105
    ISSN (online) 1471-2105
    DOI 10.1186/s12859-018-2126-1
    Database MEDical Literature Analysis and Retrieval System OnLINE

  3. Article ; Online: A model industrial workhorse: Bacillus subtilis strain 168 and its genome after a quarter of a century.

    Bremer, Erhard / Calteau, Alexandra / Danchin, Antoine / Harwood, Colin / Helmann, John D / Médigue, Claudine / Palsson, Bernhard O / Sekowska, Agnieszka / Vallenet, David / Zuniga, Abril / Zuniga, Cristal

    Microbial biotechnology

    2023  Volume 16, Issue 6, Page(s) 1203–1231

    Abstract The vast majority of genomic sequences are automatically annotated using various software programs. The accuracy of these annotations depends heavily on the very few manual annotation efforts that combine verified experimental data with genomic sequences from model organisms. Here, we summarize the updated functional annotation of Bacillus subtilis strain 168, a quarter century after its genome sequence was first made public. Since the last such effort 5 years ago, 1168 genetic functions have been updated, allowing the construction of a new metabolic model of this organism of environmental and industrial interest. The emphasis in this review is on new metabolic insights, the role of metals in metabolism and macromolecule biosynthesis, functions involved in biofilm formation, features controlling cell growth, and finally, protein agents that allow class discrimination, thus allowing maintenance management, and accuracy of all cell processes. New 'genomic objects' and an extensive updated literature review have been included for the sequence, now available at the International Nucleotide Sequence Database Collaboration (INSDC: AccNum AL009126.4).
    MeSH term(s) Bacillus subtilis/genetics ; Bacillus subtilis/metabolism ; Genomics ; Genome, Bacterial
    Language English
    Publishing date 2023-04-01
    Publishing country United States
    Document type Journal Article ; Review ; Research Support, Non-U.S. Gov't ; Research Support, N.I.H., Extramural
    ZDB-ID 2406063-X
    ISSN 1751-7915
    ISSN (online) 1751-7915
    DOI 10.1111/1751-7915.14257
    Database MEDical Literature Analysis and Retrieval System OnLINE

  4. Article ; Online: Ancestral state reconstruction of metabolic pathways across pangenome ensembles.

    Psomopoulos, Fotis E / van Helden, Jacques / Médigue, Claudine / Chasapi, Anastasia / Ouzounis, Christos A

    Microbial genomics

    2020  Volume 6, Issue 11

    Abstract As genome sequencing efforts are unveiling the genetic diversity of the biosphere with an unprecedented speed, there is a need to accurately describe the structural and functional properties of groups of extant species whose genomes have been sequenced, as well as their inferred ancestors, at any given taxonomic level of their phylogeny. Elaborate approaches for the reconstruction of ancestral states at the sequence level have been developed, subsequently augmented by methods based on gene content. While these approaches of sequence or gene-content reconstruction have been successfully deployed, there has been less progress on the explicit inference of functional properties of ancestral genomes, in terms of metabolic pathways and other cellular processes. Herein, we describe PathTrace, an efficient algorithm for parsimony-based reconstructions of the evolutionary history of individual metabolic pathways, pivotal representations of key functional modules of cellular function. The algorithm is implemented as a five-step process through which pathways are represented as fuzzy vectors, where each enzyme is associated with a taxonomic conservation value derived from the phylogenetic profile of its protein sequence. The method is evaluated with a selected benchmark set of pathways against collections of genome sequences from key data resources. By deploying a pangenome-driven approach for pathway sets, we demonstrate that the inferred patterns are largely insensitive to noise, as opposed to gene-content reconstruction methods. In addition, the resulting reconstructions are closely correlated with the evolutionary distance of the taxa under study, suggesting that a diligent selection of target pangenomes is essential for maintaining cohesiveness of the method and consistency of the inference, serving as an internal control for an arbitrary selection of queries. 
    The PathTrace method is a first step towards the large-scale analysis of metabolic pathway evolution and our deeper understanding of functional relationships reflected in emerging pangenome collections.
    MeSH term(s) Algorithms ; Amino Acid Sequence ; Bacteria/genetics ; Bacteria/metabolism ; Base Sequence ; Evolution, Molecular ; Genome/genetics ; Metabolic Networks and Pathways/genetics ; Phylogeny ; Software
    Language English
    Publishing date 2020-09-14
    Publishing country England
    Document type Journal Article ; Research Support, Non-U.S. Gov't
    ZDB-ID 2835258-0
    ISSN 2057-5858
    ISSN (online) 2057-5858
    DOI 10.1099/mgen.0.000429
    Database MEDical Literature Analysis and Retrieval System OnLINE

  5. Article ; Online: Correction: PPanGGOLiN: Depicting microbial diversity via a partitioned pangenome graph.

    Gautreau, Guillaume / Bazin, Adelme / Gachet, Mathieu / Planel, Rémi / Burlot, Laura / Dubois, Mathieu / Perrin, Amandine / Médigue, Claudine / Calteau, Alexandra / Cruveiller, Stéphane / Matias, Catherine / Ambroise, Christophe / Rocha, Eduardo P C / Vallenet, David

    PLoS computational biology

    2021  Volume 17, Issue 12, Page(s) e1009687

    Abstract [This corrects the article DOI: 10.1371/journal.pcbi.1007732.].
    Language English
    Publishing date 2021-12-10
    Publishing country United States
    Document type Published Erratum
    ZDB-ID 2193340-6
    ISSN 1553-734X
    ISSN (online) 1553-7358
    DOI 10.1371/journal.pcbi.1009687
    Database MEDical Literature Analysis and Retrieval System OnLINE

  6. Article ; Online: Specialization of small non-conjugative plasmids in

    Branger, Catherine / Ledda, Alice / Billard-Pomares, Typhaine / Doublet, Benoît / Barbe, Valérie / Roche, David / Médigue, Claudine / Arlet, Guillaume / Denamur, Erick

    Microbial genomics

    2019  Volume 5, Issue 9

    Abstract We undertook a comprehensive comparative analysis of a collection of 30 small (<25 kb) non-conjugative
    MeSH term(s) Databases, Genetic ; Escherichia coli/genetics ; Evolution, Molecular ; Gene Frequency ; Phylogeny ; Plasmids/classification ; Plasmids/genetics ; Plasmids/metabolism ; Species Specificity
    Language English
    Publishing date 2019-08-05
    Publishing country England
    Document type Journal Article ; Research Support, Non-U.S. Gov't
    ZDB-ID 2835258-0
    ISSN 2057-5858
    ISSN (online) 2057-5858
    DOI 10.1099/mgen.0.000281
    Database MEDical Literature Analysis and Retrieval System OnLINE

  7. Article ; Online: PlaScope: a targeted approach to assess the plasmidome from genome assemblies at the species level.

    Royer, G / Decousser, J W / Branger, C / Dubois, M / Médigue, C / Denamur, E / Vallenet, D

    Microbial genomics

    2018  Volume 4, Issue 9

    Abstract Plasmid prediction may be of great interest when studying bacteria of medical importance such as Enterobacteriaceae as well as Staphylococcus aureus or Enterococcus. Indeed, many resistance and virulence genes are located on such replicons with major impact in terms of pathogenicity and spreading capacities. Beyond strain outbreak, plasmid outbreaks have been reported in particular for some extended-spectrum beta-lactamase- or carbapenemase-producing Enterobacteriaceae. Several tools are now available to explore the 'plasmidome' from whole-genome sequences with various approaches, but none of them are able to combine high sensitivity and specificity. With this in mind, we developed PlaScope, a targeted approach to recover plasmidic sequences in genome assemblies at the species or genus level. Based on Centrifuge, a metagenomic classifier, and a custom database containing complete sequences of chromosomes and plasmids from various curated databases, PlaScope classifies contigs from an assembly according to their predicted location. Compared to other plasmid classifiers, PlasFlow and cBar, it achieves better recall (0.87), specificity (0.99), precision (0.96) and accuracy (0.98) on a dataset of 70 genomes of Escherichia coli containing plasmids. In a second part, we identified 20 of the 21 chromosomal integrations of the extended-spectrum beta-lactamase coding gene in a clinical dataset of E. coli strains. In addition, we predicted virulence gene and operon locations in agreement with the literature. We also built a database for Klebsiella and correctly assigned the location for the majority of resistance genes from a collection of 12 Klebsiella pneumoniae strains. Similar approaches could also be developed for other well-characterized bacteria.
    MeSH term(s) Chromosomes, Bacterial ; Drug Resistance, Bacterial/genetics ; Escherichia coli/genetics ; Genome, Bacterial ; Klebsiella pneumoniae/genetics ; Operon ; Plasmids/genetics ; Software ; Virulence Factors/genetics ; Whole Genome Sequencing ; Workflow
    Chemical Substances Virulence Factors
    Language English
    Publishing date 2018-09-28
    Publishing country England
    Document type Journal Article ; Research Support, Non-U.S. Gov't
    ZDB-ID 2835258-0
    ISSN 2057-5858
    ISSN (online) 2057-5858
    DOI 10.1099/mgen.0.000211
    Database MEDical Literature Analysis and Retrieval System OnLINE

  8. Article ; Online: PPanGGOLiN: Depicting microbial diversity via a partitioned pangenome graph.

    Gautreau, Guillaume / Bazin, Adelme / Gachet, Mathieu / Planel, Rémi / Burlot, Laura / Dubois, Mathieu / Perrin, Amandine / Médigue, Claudine / Calteau, Alexandra / Cruveiller, Stéphane / Matias, Catherine / Ambroise, Christophe / Rocha, Eduardo P C / Vallenet, David

    PLoS computational biology

    2020  Volume 16, Issue 3, Page(s) e1007732

    Abstract The use of comparative genomics for functional, evolutionary, and epidemiological studies requires methods to classify gene families in terms of occurrence in a given species. These methods usually lack multivariate statistical models to infer the partitions and the optimal number of classes and don't account for genome organization. We introduce a graph structure to model pangenomes in which nodes represent gene families and edges represent genomic neighborhood. Our method, named PPanGGOLiN, partitions nodes using an Expectation-Maximization algorithm based on multivariate Bernoulli Mixture Model coupled with a Markov Random Field. This approach takes into account the topology of the graph and the presence/absence of genes in pangenomes to classify gene families into persistent, cloud, and one or several shell partitions. By analyzing the partitioned pangenome graphs of isolate genomes from 439 species and metagenome-assembled genomes from 78 species, we demonstrate that our method is effective in estimating the persistent genome. Interestingly, it shows that the shell genome is a key element to understand genome dynamics, presumably because it reflects how genes present at intermediate frequencies drive adaptation of species, and its proportion in genomes is independent of genome size. The graph-based approach proposed by PPanGGOLiN is useful to depict the overall genomic diversity of thousands of strains in a compact structure and provides an effective basis for very large scale comparative genomics. The software is freely available at https://github.com/labgem/PPanGGOLiN.
    MeSH term(s) Algorithms ; Bacteria/classification ; Bacteria/genetics ; Genome, Bacterial/genetics ; Genomics/methods ; Multivariate Analysis ; Software
    Language English
    Publishing date 2020-03-19
    Publishing country United States
    Document type Journal Article ; Research Support, Non-U.S. Gov't
    ZDB-ID 2193340-6
    ISSN 1553-734X
    ISSN (online) 1553-7358
    DOI 10.1371/journal.pcbi.1007732
    Database MEDical Literature Analysis and Retrieval System OnLINE

  9. Article ; Online: Distinct co-evolution patterns of genes associated to DNA polymerase III DnaE and PolC.

    Engelen, Stefan / Vallenet, David / Médigue, Claudine / Danchin, Antoine

    BMC genomics

    2012  Volume 13, Page(s) 69

    Abstract Background: Bacterial genomes displaying a strong bias between the leading and the lagging strand of DNA replication encode two DNA polymerases III, DnaE and PolC, rather than a single one. Replication is a highly unsymmetrical process, and the presence of two polymerases is therefore not unexpected. Using comparative genomics, we explored whether other processes have evolved in parallel with each polymerase.
    Results: Extending previous in silico heuristics for the analysis of gene co-evolution, we analyzed the function of genes clustering with dnaE and polC. Clusters were highly informative. DnaE co-evolves with the ribosome, the transcription machinery, the core of intermediary metabolism enzymes. It is also connected to the energy-saving enzyme necessary for RNA degradation, polynucleotide phosphorylase. Most of the proteins of this co-evolving set belong to the persistent set in bacterial proteomes, that is fairly ubiquitously distributed. In contrast, PolC co-evolves with RNA degradation enzymes that are present only in the A+T-rich Firmicutes clade, suggesting at least two origins for the degradosome.
    Conclusion: DNA replication involves two machineries, DnaE and PolC. DnaE co-evolves with the core functions of bacterial life. In contrast PolC co-evolves with a set of RNA degradation enzymes that does not derive from the degradosome identified in gamma-Proteobacteria. This suggests that at least two independent RNA degradation pathways existed in the progenote community at the end of the RNA genome world.
    MeSH term(s) Bacteria/enzymology ; Bacteria/genetics ; Bacterial Proteins/genetics ; DNA Polymerase III/genetics ; DNA-Directed DNA Polymerase/genetics ; Endoribonucleases/genetics ; Evolution, Molecular ; Genes, Bacterial/genetics ; Genomics ; Multienzyme Complexes/genetics ; Phylogeny ; Polyribonucleotide Nucleotidyltransferase/genetics ; RNA Helicases/genetics
    Chemical Substances Bacterial Proteins ; Multienzyme Complexes ; degradosome ; DNA polymerase III, alpha subunit (EC 2.7.7.-) ; PolC protein, bacteria (EC 2.7.7.-) ; DNA Polymerase III (EC 2.7.7.7) ; DNA-Directed DNA Polymerase (EC 2.7.7.7) ; Polyribonucleotide Nucleotidyltransferase (EC 2.7.7.8) ; Endoribonucleases (EC 3.1.-) ; RNA Helicases (EC 3.6.4.13)
    Language English
    Publishing date 2012-02-14
    Publishing country England
    Document type Journal Article ; Research Support, Non-U.S. Gov't
    ISSN 1471-2164
    ISSN (online) 1471-2164
    DOI 10.1186/1471-2164-13-69
    Database MEDical Literature Analysis and Retrieval System OnLINE

  10. Article ; Online: Profiling the orphan enzymes.

    Sorokina, Maria / Stam, Mark / Médigue, Claudine / Lespinet, Olivier / Vallenet, David

    Biology direct

    2014  Volume 9, Page(s) 10

    Abstract The emergence of Next Generation Sequencing generates an incredible amount of sequence and great potential for new enzyme discovery. Despite this huge amount of data and the profusion of bioinformatic methods for function prediction, a large part of known enzyme activities is still lacking an associated protein sequence. These particular activities are called "orphan enzymes". The present review proposes an update of previous surveys on orphan enzymes by mining the current content of public databases. While the percentage of orphan enzyme activities has decreased from 38% to 22% in ten years, there are still more than 1,000 orphans among the 5,000 entries of the Enzyme Commission (EC) classification. Taking into account all the reactions present in metabolic databases, this proportion dramatically increases to reach nearly 50% of orphans and many of them are not associated to a known pathway. We extended our survey to "local orphan enzymes" that are activities which have no representative sequence in a given clade, but have at least one in organisms belonging to other clades. We observe an important bias in Archaea and find that in general more than 30% of the EC activities have incomplete sequence information in at least one superkingdom. To estimate if candidate proteins for local orphans could be retrieved by homology search, we applied a simple strategy based on the PRIAM software and noticed that candidates may be proposed for an important fraction of local orphan enzymes. Finally, by studying relation between protein domains and catalyzed activities, it appears that newly discovered enzymes are mostly associated with already known enzyme domains. Thus, the exploration of the promiscuity and the multifunctional aspect of known enzyme families may solve part of the orphan enzyme issue. 
    We conclude this review with a presentation of recent initiatives in finding proteins for orphan enzymes and in extending the enzyme world by the discovery of new activities.
    MeSH term(s) Archaea/genetics ; Archaea/metabolism ; Bacteria/genetics ; Bacteria/metabolism ; Databases, Protein ; Enzymes/classification ; Enzymes/genetics ; Enzymes/metabolism ; Eukaryota/genetics ; Eukaryota/metabolism ; Genomics/methods ; High-Throughput Nucleotide Sequencing ; Phylogeny ; Proteins/classification ; Proteins/genetics ; Proteins/metabolism ; Proteomics/methods ; Sequence Analysis, Protein
    Chemical Substances Enzymes ; Proteins
    Language English
    Publishing date 2014-06-06
    Publishing country England
    Document type Journal Article ; Review
    ISSN 1745-6150
    ISSN (online) 1745-6150
    DOI 10.1186/1745-6150-9-10
    Database MEDical Literature Analysis and Retrieval System OnLINE
