Article ; Online: SANS: high-throughput retrieval of protein sequences allowing 50% mismatches.
Bioinformatics (Oxford, England)
2012 Volume 28, Issue 18, Page(s) i438–i443
Abstract | Motivation: The genomic era in molecular biology has brought on a rapidly widening gap between the amount of sequence data and first-hand experimental characterization of proteins. Fortunately, the theory of evolution provides a simple solution: functional and structural information can be transferred between homologous proteins. Sequence similarity searching followed by k-nearest neighbor classification is the most widely used tool to predict the function or structure of anonymous gene products that come out of genome sequencing projects. Results: We present a novel word filter, suffix array neighborhood search (SANS), to identify protein sequence similarities in the range of 50-100% identity with sensitivity comparable to BLAST and 10 times the speed of USEARCH. In contrast to these previous approaches, the complexity of the search is proportional only to the length of the query sequence and independent of database size, enabling fast searching and functional annotation into the future despite rapidly expanding databases. Availability and implementation: The software is freely available to non-commercial users from our website http://ekhidna.biocenter.helsinki.fi/downloads/sans. Contact: liisa.holm@helsinki.fi. |
---|---|
MeSH term(s) | Algorithms ; Bacterial Proteins/chemistry ; Bacterial Proteins/genetics ; Databases, Protein ; Genome ; Metagenome ; Sequence Alignment ; Sequence Analysis, Protein/methods ; Sequence Homology, Amino Acid ; Software |
Chemical Substances | Bacterial Proteins |
Language | English |
Publishing date | 2012-09-07 |
Publishing country | England |
Document type | Journal Article ; Research Support, Non-U.S. Gov't |
ZDB-ID | 1422668-6 |
ISSN (online) | 1367-4811 |
ISSN (print) | 1367-4803 |
DOI | 10.1093/bioinformatics/bts417 |
Database | MEDical Literature Analysis and Retrieval System OnLINE |
In stock of ZB MED Cologne/Königswinter
Zs.A 2374: Show issues | Location: depending on availability (see the holdings details). Up to vol. 1994: order articles via the online order form. Vol. 1995 - 2021: reading room (2nd floor). From vol. 2022: reading room (ground floor) |