Book ; Online: An experimental sorting method for improving metagenomic data encoding
2024
Abstract | Minimizing data storage poses a significant challenge in large-scale metagenomic projects. In this paper, we present a new method for improving the encoding of FASTQ files generated by metagenomic sequencing. This method incorporates metagenomic classification followed by a recursive filter for clustering reads by DNA sequence similarity to improve the overall reference-free compression. In the results, we show an overall improvement in the compression of several datasets. As hypothesized, we show a progressive compression gain for higher coverage depth and number of identified species. Additionally, we provide an implementation that is freely available at https://github.com/cobilab/mizar and can be customized to work with other FASTQ compression tools. |
---|---|
Keywords | Computer Science - Information Theory ; Quantitative Biology - Genomics |
Publishing date | 2024-01-03 |
Publishing country | US |
Document type | Book ; Online |
Database | BASE - Bielefeld Academic Search Engine (life sciences selection) |