LIVIVO - The Search Portal for Life Sciences

Search results

Results 1 - 10 of 14 total

  1. Article ; Online: Improved measures for evolutionary conservation that exploit taxonomy distances.

    Malhis, Nawar / Jones, Steven J M / Gsponer, Jörg

    Nature communications

    2019  Volume 10, Issue 1, Page(s) 1556

    Abstract Selective pressures on protein-coding regions that provide fitness advantages can lead to the regions' fixation and conservation in genome duplications and speciation events. Consequently, conservation analyses relying on sequence similarities are exploited by a myriad of applications across all biosciences to identify functionally important protein regions. While very potent, existing conservation measures based on multiple sequence alignments are so pervasive that improvements to solutions of many problems have become incremental. We introduce a new framework for evolutionary conservation with measures that exploit taxonomy distances across species. Results show that our taxonomy-based framework comfortably outperforms existing conservation measures in identifying deleterious variants observed in the human population, including variants located in non-abundant sequence domains such as intrinsically disordered regions. The predictive power of our approach emphasizes that the phenotypic effects of sequence variants can be taxonomy-level specific and thus, conservation needs to be interpreted accordingly.
    MeSH term(s) Classification/methods ; Evolution, Molecular ; Genetic Variation ; Humans ; Proteins/chemistry ; Proteins/genetics ; Sequence Alignment ; Sequence Analysis, Protein
    Chemical Substances Proteins
    Language English
    Publishing date 2019-04-05
    Publishing country England
    Document type Journal Article ; Research Support, Non-U.S. Gov't
    ZDB-ID 2553671-0
    ISSN (online) 2041-1723
    DOI 10.1038/s41467-019-09583-2
    Database MEDical Literature Analysis and Retrieval System OnLINE

  2. Article ; Online: LIST-S2: taxonomy based sorting of deleterious missense mutations across species.

    Malhis, Nawar / Jacobson, Matthew / Jones, Steven J M / Gsponer, Jörg

    Nucleic acids research

    2020  Volume 48, Issue W1, Page(s) W154–W161

    Abstract The separation of deleterious from benign mutations remains a key challenge in the interpretation of genomic data. Computational methods used to sort mutations based on their potential deleteriousness rely largely on conservation measures derived from sequence alignments. Here, we introduce LIST-S2, a successor to our previously developed approach LIST, which aims to exploit local sequence identity and taxonomy distances in quantifying the conservation of human protein sequences. Unlike its predecessor, LIST-S2 is not limited to human sequences but can assess conservation and make predictions for sequences from any organism. Moreover, we provide a web-tool and downloadable software to compute and visualize the deleteriousness of mutations in user-provided sequences. This web-tool contains an HTML interface and a RESTful API to submit and manage sequences as well as a browsable set of precomputed predictions for a large number of UniProtKB protein sequences of common taxa. LIST-S2 is available at: https://list-s2.msl.ubc.ca/.
    MeSH term(s) Animals ; Germ-Line Mutation ; Humans ; Mutation, Missense ; Neoplasms/genetics ; Sequence Analysis, Protein ; Software
    Language English
    Publishing date 2020-04-30
    Publishing country England
    Document type Journal Article ; Research Support, Non-U.S. Gov't
    ZDB-ID 186809-3
    ISSN (online) 1362-4962 ; 1362-4954
    ISSN 0301-5610 ; 0305-1048
    DOI 10.1093/nar/gkaa288
    Database MEDical Literature Analysis and Retrieval System OnLINE

  3. Article ; Online: Computational identification of MoRFs in protein sequences.

    Malhis, Nawar / Gsponer, Jörg

    Bioinformatics (Oxford, England)

    2015  Volume 31, Issue 11, Page(s) 1738–1744

    Abstract Motivation: Intrinsically disordered regions of proteins play an essential role in the regulation of various biological processes. Key to their regulatory function is the binding of molecular recognition features (MoRFs) to globular protein domains in a process known as a disorder-to-order transition. Predicting the location of MoRFs in protein sequences with high accuracy remains an important computational challenge.
    Method: In this study, we introduce MoRFCHiBi, a new computational approach for fast and accurate prediction of MoRFs in protein sequences. MoRFCHiBi combines the outcomes of two support vector machine (SVM) models that take advantage of two different kernels with high noise tolerance. The first, SVMS, is designed to extract maximal information from the general contrast in amino acid compositions between MoRFs, their surrounding regions (Flanks), and the remainders of the sequences. The second, SVMT, is used to identify similarities between regions in a query sequence and MoRFs of the training set.
    Results: We evaluated the performance of our predictor by comparing its results with those of two currently available MoRF predictors, MoRFpred and ANCHOR. Using three test sets that have previously been collected and used to evaluate MoRFpred and ANCHOR, we demonstrate that MoRFCHiBi outperforms the other predictors with respect to different evaluation metrics. In addition, MoRFCHiBi is downloadable and fast, which makes it useful as a component in other computational prediction tools.
    Availability and implementation: http://www.chibi.ubc.ca/morf/.
    MeSH term(s) Algorithms ; Amino Acids ; Computational Biology/methods ; Intrinsically Disordered Proteins/chemistry ; Protein Structure, Tertiary ; Sequence Analysis, Protein/methods ; Software ; Support Vector Machine
    Chemical Substances Amino Acids ; Intrinsically Disordered Proteins
    Language English
    Publishing date 2015-06-01
    Publishing country England
    Document type Journal Article ; Research Support, Non-U.S. Gov't
    ZDB-ID 1422668-6
    ISSN (online) 1367-4811
    ISSN 1367-4803
    DOI 10.1093/bioinformatics/btv060
    Database MEDical Literature Analysis and Retrieval System OnLINE

  4. Article ; Online: Tutorial: a guide for the selection of fast and accurate computational tools for the prediction of intrinsic disorder in proteins.

    Kurgan, Lukasz / Hu, Gang / Wang, Kui / Ghadermarzi, Sina / Zhao, Bi / Malhis, Nawar / Erdős, Gábor / Gsponer, Jörg / Uversky, Vladimir N / Dosztányi, Zsuzsanna

    Nature protocols

    2023  Volume 18, Issue 11, Page(s) 3157–3172

    Abstract Intrinsic disorder is instrumental for a wide range of protein functions, and its analysis, using computational predictions from primary structures, complements secondary and tertiary structure-based approaches. In this Tutorial, we provide an overview and comparison of 23 publicly available computational tools with complementary parameters useful for intrinsic disorder prediction, partly relying on results from the Critical Assessment of protein Intrinsic Disorder prediction experiment. We consider factors such as accuracy, runtime, availability and the need for functional insights. The selected tools are available as web servers and downloadable programs, offer state-of-the-art predictions and can be used in a high-throughput manner. We provide examples and instructions for the selected tools to illustrate practical aspects related to the submission, collection and interpretation of predictions, as well as the timing and their limitations. We highlight two predictors for intrinsically disordered proteins, flDPnn as accurate and fast and IUPred as very fast and moderately accurate, while suggesting ANCHOR2 and MoRFchibi as two of the best-performing predictors for intrinsically disordered region binding. We link these tools to additional resources, including databases of predictions and web servers that integrate multiple predictive methods. Altogether, this Tutorial provides a hands-on guide to comparatively evaluating multiple predictors, submitting and collecting their own predictions, and reading and interpreting results. It is suitable for experimentalists and computational biologists interested in accurately and conveniently identifying intrinsic disorder, facilitating the functional characterization of the rapidly growing collections of protein sequences.
    MeSH term(s) Computational Biology/methods ; Databases, Protein ; Intrinsically Disordered Proteins/chemistry ; Amino Acid Sequence
    Chemical Substances Intrinsically Disordered Proteins
    Language English
    Publishing date 2023-09-22
    Publishing country England
    Document type Journal Article ; Review
    ZDB-ID 2244966-8
    ISSN (online) 1750-2799
    ISSN 1754-2189
    DOI 10.1038/s41596-023-00876-x
    Database MEDical Literature Analysis and Retrieval System OnLINE

  5. Article ; Online: DescribePROT in 2023: more, higher-quality and experimental annotations and improved data download options.

    Basu, Sushmita / Zhao, Bi / Biró, Bálint / Faraggi, Eshel / Gsponer, Jörg / Hu, Gang / Kloczkowski, Andrzej / Malhis, Nawar / Mirdita, Milot / Söding, Johannes / Steinegger, Martin / Wang, Duolin / Wang, Kui / Xu, Dong / Zhang, Jian / Kurgan, Lukasz

    Nucleic acids research

    2023  Volume 52, Issue D1, Page(s) D426–D433

    Abstract The DescribePROT database of amino acid-level descriptors of protein structures and functions has been substantially expanded since its release in 2020. This expansion includes a substantial increase in the size, scope, and quality of the underlying data, the addition of experimental structural information, the inclusion of new data download options, and an upgraded graphical interface. DescribePROT currently covers 19 structural and functional descriptors for proteins in 273 reference proteomes generated by 11 accurate and complementary predictive tools. Users can search our resource in multiple ways, interact with the data using the graphical interface, and download data at various scales including individual proteins, entire proteomes, and the whole database. The annotations in DescribePROT are useful for a broad spectrum of studies that include investigations of protein structure and function, development and validation of predictive tools, and support for efforts to understand the molecular underpinnings of diseases and to develop therapeutics. DescribePROT can be freely accessed at http://biomine.cs.vcu.edu/servers/DESCRIBEPROT/.
    MeSH term(s) Proteome/chemistry ; Databases, Factual ; Amino Acids
    Chemical Substances Proteome ; Amino Acids
    Language English
    Publishing date 2023-11-07
    Publishing country England
    Document type Journal Article
    ZDB-ID 186809-3
    ISSN (online) 1362-4962 ; 1362-4954
    ISSN 0301-5610 ; 0305-1048
    DOI 10.1093/nar/gkad985
    Database MEDical Literature Analysis and Retrieval System OnLINE

  6. Article ; Online: MoRFchibi SYSTEM: software tools for the identification of MoRFs in protein sequences.

    Malhis, Nawar / Jacobson, Matthew / Gsponer, Jörg

    Nucleic acids research

    2016  Volume 44, Issue W1, Page(s) W488–W493

    Abstract Molecular recognition features, MoRFs, are short segments within longer disordered protein regions that bind to globular protein domains in a process known as disorder-to-order transition. MoRFs have been found to play a significant role in signaling and regulatory processes in cells. High-confidence computational identification of MoRFs remains an important challenge. In this work, we introduce MoRFchibi SYSTEM, which contains three MoRF predictors: MoRFCHiBi, a basic predictor best suited as a component in other applications; MoRFCHiBi_Light, ideal for high-throughput predictions; and MoRFCHiBi_Web, slower than the other two but best for high-accuracy predictions. Results show that MoRFchibi SYSTEM provides more than double the precision of other predictors. MoRFchibi SYSTEM is available in three different forms: as an HTML web server, a RESTful web server and downloadable software at: http://www.chibi.ubc.ca/faculty/joerg-gsponer/gsponer-lab/software/morf_chibi/.
    MeSH term(s) Amino Acid Sequence ; Benchmarking ; CD3 Complex/chemistry ; CD3 Complex/metabolism ; Datasets as Topic ; High-Throughput Screening Assays ; Humans ; Internet ; Protein Binding ; Proteins/chemistry ; Proteins/metabolism ; Software
    Chemical Substances CD3 Complex ; CD3E protein, human ; Proteins
    Language English
    Publishing date 2016-05-12
    Publishing country England
    Document type Journal Article ; Research Support, Non-U.S. Gov't
    ZDB-ID 186809-3
    ISSN (online) 1362-4962 ; 1362-4954
    ISSN 0301-5610 ; 0305-1048
    DOI 10.1093/nar/gkw409
    Database MEDical Literature Analysis and Retrieval System OnLINE

  7. Article ; Online: Computational Disorder Analysis in Ethylene Response Factors Uncovers Binding Motifs Critical to Their Diverse Functions.

    Sun, Xiaolin / Malhis, Nawar / Zhao, Bi / Xue, Bin / Gsponer, Joerg / Rikkerink, Erik H A

    International journal of molecular sciences

    2019  Volume 21, Issue 1

    Abstract APETALA2/ETHYLENE RESPONSE FACTOR transcription factors (AP2/ERFs) play crucial roles in adaptation to stresses such as those caused by pathogens, wounding and cold. Although their name suggests a specific role in ethylene signalling, some ERF members also co-ordinate signals regulated by other key plant stress hormones such as jasmonate, abscisic acid and salicylate. We analysed a set of ERF proteins from three divergent plant species for intrinsically disordered regions containing conserved segments involved in protein-protein interaction known as Molecular Recognition Features (MoRFs). Then we correlated the MoRFs identified with a number of known functional features where these could be identified. Our analyses suggest that MoRFs, with plasticity in their disordered surroundings, are highly functional and may have been shuffled between related protein families driven by selection. A particularly important role may be played by the alpha helical component of the structured DNA binding domain to permit specificity. We also present examples of computationally identified MoRFs that have no known function and provide a valuable conceptual framework to link both disordered and ordered structural features within this family to diverse function.
    MeSH term(s) Amino Acid Sequence ; Ethylenes/metabolism ; Gene Expression Regulation, Plant ; Models, Molecular ; Phylogeny ; Plant Growth Regulators/metabolism ; Plant Proteins/chemistry ; Plant Proteins/genetics ; Plant Proteins/metabolism ; Plants/chemistry ; Plants/genetics ; Plants/metabolism ; Protein Interaction Domains and Motifs ; Protein Interaction Maps ; Stress, Physiological ; Transcription Factors/chemistry ; Transcription Factors/genetics ; Transcription Factors/metabolism
    Chemical Substances Ethylenes ; Plant Growth Regulators ; Plant Proteins ; Transcription Factors ; ethylene (91GW059KN7)
    Language English
    Publishing date 2019-12-20
    Publishing country Switzerland
    Document type Journal Article
    ZDB-ID 2019364-6
    ISSN (online) 1422-0067
    ISSN 1422-0067 ; 1661-6596
    DOI 10.3390/ijms21010074
    Database MEDical Literature Analysis and Retrieval System OnLINE

  8. Article ; Online: Protein-Protein Interactions Mediated by Intrinsically Disordered Protein Regions Are Enriched in Missense Mutations.

    Wong, Eric T C / So, Victor / Guron, Mike / Kuechler, Erich R / Malhis, Nawar / Bui, Jennifer M / Gsponer, Jörg

    Biomolecules

    2020  Volume 10, Issue 8

    Abstract Because proteins are fundamental to most biological processes, many genetic diseases can be traced back to single nucleotide variants (SNVs) that cause changes in protein sequences. However, not all SNVs that result in amino acid substitutions cause disease, as each residue is under different structural and functional constraints. Influential studies have shown that protein-protein interaction interfaces are enriched in disease-associated SNVs and depleted in SNVs that are common in the general population. These studies focus primarily on folded (globular) protein domains and overlook the prevalent class of protein interactions mediated by intrinsically disordered regions (IDRs). Therefore, we investigated the enrichment patterns of missense mutation-causing SNVs that are associated with disease and cancer, as well as those present in the healthy population, in structures of IDR-mediated interactions with comparisons to classical globular interactions. When comparing the different categories of interaction interfaces, division of the interface regions into solvent-exposed rim residues and buried core residues reveals distinctive enrichment patterns for the various types of missense mutations. Most notably, we demonstrate a strong enrichment at the interface core of interacting IDRs in disease mutations and its depletion in neutral ones, which supports the view that the disruption of IDR interactions is a mechanism underlying many diseases. Intriguingly, we also found an asymmetry across the IDR interaction interface in the enrichment of certain missense mutation types, which may hint at an increased variant tolerance and urges further investigations of IDR interactions.
    MeSH term(s) Algorithms ; Databases, Protein ; Humans ; Intrinsically Disordered Proteins/chemistry ; Intrinsically Disordered Proteins/genetics ; Intrinsically Disordered Proteins/metabolism ; Models, Molecular ; Mutation, Missense ; Polymorphism, Single Nucleotide ; Protein Binding ; Protein Domains
    Chemical Substances Intrinsically Disordered Proteins
    Language English
    Publishing date 2020-07-24
    Publishing country Switzerland
    Document type Journal Article ; Research Support, Non-U.S. Gov't
    ZDB-ID 2701262-1
    ISSN (online) 2218-273X
    DOI 10.3390/biom10081097
    Database MEDical Literature Analysis and Retrieval System OnLINE

  9. Article ; Online: High quality SNP calling using Illumina data at shallow coverage.

    Malhis, Nawar / Jones, Steven J M

    Bioinformatics (Oxford, England)

    2010  Volume 26, Issue 8, Page(s) 1029–1035

    Abstract Motivation: Detection of single nucleotide polymorphisms (SNPs) has been a major application in processing second generation sequencing (SGS) data. In principle, SNPs are called on single base differences between a reference genome and a sequence generated from SGS short reads of a sample genome. However, this exercise is far from trivial; several parameters related to sequencing quality and/or reference genome properties have an essential effect on the accuracy of called SNPs, especially for shallow-coverage data. In this work, we present Slider II, an alignment and SNP calling approach that demonstrates improved algorithmic approaches enabling a larger number of called SNPs with a lower false positive rate. In addition to the regular alignment and SNP calling, as an optional feature, Slider II is capable of utilizing information about known SNPs of a target genome, as priors, in the alignment and SNP calling to enhance its capability of detecting these known SNPs as well as novel SNPs and mutations in their vicinity.
    MeSH term(s) Algorithms ; Databases, Genetic ; Genome ; Polymorphism, Single Nucleotide ; Sequence Alignment/methods ; Sequence Analysis, DNA
    Language English
    Publishing date 2010-04-15
    Publishing country England
    Document type Journal Article ; Research Support, Non-U.S. Gov't
    ZDB-ID 1422668-6
    ISSN (online) 1367-4811
    ISSN 1367-4803
    DOI 10.1093/bioinformatics/btq092
    Database MEDical Literature Analysis and Retrieval System OnLINE

  10. Article ; Online: Computational Identification of MoRFs in Protein Sequences Using Hierarchical Application of Bayes Rule.

    Malhis, Nawar / Wong, Eric T C / Nassar, Roy / Gsponer, Jörg

    PloS one

    2015  Volume 10, Issue 10, Page(s) e0141603

    Abstract Motivation: Intrinsically disordered regions of proteins play an essential role in the regulation of various biological processes. Key to their regulatory function is often the binding to globular protein domains via sequence elements known as molecular recognition features (MoRFs). Development of computational tools for the identification of candidate MoRF locations in amino acid sequences is an important task and an area of growing interest. Given the relative sparseness of MoRFs in protein sequences, the accuracy of the available MoRF predictors is often inadequate for practical usage, which leaves a significant need and room for improvement. In this work, we introduce MoRFCHiBi_Web, which predicts MoRF locations in protein sequences with higher accuracy compared to current MoRF predictors.
    Methods: Three distinct and largely independent property scores are computed with component predictors and then combined to generate the final MoRF propensity scores. The first score reflects the likelihood of sequence windows to harbour MoRFs and is based on amino acid composition and sequence similarity information. It is generated by MoRFCHiBi using small windows of up to 40 residues in size. The second score identifies long stretches of protein disorder and is generated by ESpritz with the DisProt option. Lastly, the third score reflects residue conservation and is assembled from PSSM files generated by PSI-BLAST. These propensity scores are processed and then hierarchically combined using Bayes rule to generate the final MoRFCHiBi_Web predictions.
    Results: MoRFCHiBi_Web was tested on three datasets. Results show that MoRFCHiBi_Web outperforms previously developed predictors by generating less than half the false positive rate for the same true positive rate at practical threshold values. This level of accuracy paired with its relatively high processing speed makes MoRFCHiBi_Web a practical tool for MoRF prediction.
    Availability: http://morf.chibi.ubc.ca:8080/morf/.
    MeSH term(s) Amino Acid Sequence ; Bayes Theorem ; Computational Biology/methods ; Databases, Protein ; Humans ; Propensity Score ; Protein Structure, Tertiary ; Proteins/chemistry ; Proteins/genetics ; Proteins/metabolism ; Sequence Homology, Amino Acid
    Chemical Substances Proteins
    Language English
    Publishing date 2015-10-30
    Publishing country United States
    Document type Journal Article ; Research Support, Non-U.S. Gov't
    ISSN 1932-6203
    ISSN (online) 1932-6203
    DOI 10.1371/journal.pone.0141603
    Database MEDical Literature Analysis and Retrieval System OnLINE
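
    Illustration (not part of the record): the abstract of result 10 says that three largely independent propensity scores are "hierarchically combined using Bayes rule". The abstract does not give the formula, so the sketch below is only one plausible reading, under stated assumptions: each score has already been converted to a posterior probability that a residue lies in a MoRF, the scores are conditionally independent given the class, and all share the same prior. The names combine_bayes and hierarchical_combine, and the default prior of 0.5, are hypothetical and are not taken from MoRFCHiBi_Web.

```python
from typing import Sequence


def combine_bayes(p_a: float, p_b: float, prior: float = 0.5) -> float:
    """Combine two posteriors P(MoRF | score) for the same residue.

    Assumes the two scores are conditionally independent given the class
    and that both posteriors were derived with the same prior.
    """
    eps = 1e-12  # keep probabilities away from 0 and 1 to avoid 0/0
    p_a = min(max(p_a, eps), 1.0 - eps)
    p_b = min(max(p_b, eps), 1.0 - eps)
    # Bayes rule: P(M | a, b) is proportional to P(M | a) * P(M | b) / P(M)
    pos = p_a * p_b / prior
    neg = (1.0 - p_a) * (1.0 - p_b) / (1.0 - prior)
    return pos / (pos + neg)


def hierarchical_combine(scores: Sequence[float], prior: float = 0.5) -> float:
    """Fold the scores pairwise: combine the first two, then add the next."""
    combined = scores[0]
    for score in scores[1:]:
        combined = combine_bayes(combined, score, prior)
    return combined


if __name__ == "__main__":
    # Hypothetical per-residue probabilities from three component predictors.
    print(hierarchical_combine([0.8, 0.8, 0.5]))
```

    With a prior of 0.5, a score of 0.5 is uninformative and leaves the combined value unchanged, while two agreeing scores reinforce each other. A sparser prior, which the abstract's remark on the relative sparseness of MoRFs might suggest, would shift the result and is left as a parameter.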
