Article ; Online: Faster SARS-CoV-2 sequence validation and annotation for GenBank using VADR.
NAR genomics and bioinformatics
2023 Volume 5, Issue 1, Page(s) lqad002
Abstract | In 2020 and 2021, >1.5 million SARS-CoV-2 sequences were submitted to GenBank. The initial version (v1.0) of the VADR (Viral Annotation DefineR) software package that GenBank uses to automatically validate and annotate incoming viral sequences is too slow and memory intensive to process many thousands of SARS-CoV-2 sequences in a reasonable amount of time. Additionally, long stretches of ambiguous N nucleotides, which are common in many SARS-CoV-2 sequences, prevent VADR from accurate validation and annotation. VADR has been updated to more accurately and rapidly annotate SARS-CoV-2 sequences. Stretches of consecutive Ns are now identified and temporarily replaced with expected nucleotides to facilitate processing, and the slowest steps have been overhauled using |
---|---|
Language | English |
Publishing date | 2023-01-20 |
Publishing country | England |
Document type | Journal Article |
ISSN | 2631-9268 |
ISSN (online) | 2631-9268 |
DOI | 10.1093/nargab/lqad002 |
Database | MEDical Literature Analysis and Retrieval System OnLINE |
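
The abstract states that stretches of consecutive Ns are identified and temporarily replaced with expected nucleotides before processing. The following is only a minimal sketch of that idea, not VADR's actual implementation: it assumes the query sequence is already coordinate-aligned with a reference of the same length, so that the "expected" nucleotides can simply be read from the reference at the same positions. The function names `mask_n_runs` and `restore_n_runs` and the `min_len` threshold are hypothetical and chosen for illustration.

```python
import re


def mask_n_runs(seq, ref, min_len=5):
    """Replace each run of at least `min_len` Ns with the reference bases.

    Simplifying assumption (not taken from VADR): `seq` and `ref` have the
    same length and share coordinates. Returns the patched sequence and a
    list of half-open (start, end) intervals so the Ns can be put back later.
    """
    if len(seq) != len(ref):
        raise ValueError("seq and ref must have the same length in this sketch")
    patched = list(seq)
    runs = []
    for match in re.finditer(r"N{%d,}" % min_len, seq):
        start, end = match.span()
        patched[start:end] = ref[start:end]  # substitute expected nucleotides
        runs.append((start, end))
    return "".join(patched), runs


def restore_n_runs(patched, runs):
    """Undo the temporary replacement by writing the Ns back."""
    out = list(patched)
    for start, end in runs:
        out[start:end] = "N" * (end - start)
    return "".join(out)
```

In this sketch, downstream processing would run on the patched sequence, and `restore_n_runs` would be applied afterwards so that the reported sequence still contains the original Ns.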