LIVIVO - The Search Portal for Life Sciences

Search results

Results 1-10 of 121

  1. Article ; Online: Efficiently quantifying DNA methylation for bulk- and single-cell bisulfite data.

    Fischer, Jonas / Schulz, Marcel H

    Bioinformatics (Oxford, England)

    2023  Volume 39, Issue 6

    Abstract Motivation: DNA CpG methylation (CpGm) has proven to be a crucial epigenetic factor in the mammalian gene regulatory system. Assessment of DNA CpG methylation values via whole-genome bisulfite sequencing (WGBS) is, however, computationally extremely demanding.
    Results: We present FAst MEthylation calling (FAME), the first approach to quantify CpGm values directly from bulk or single-cell WGBS reads without intermediate output files. FAME is very fast but as accurate as standard methods, which first produce BS alignment files before computing CpGm values. We present experiments on bulk and single-cell bisulfite datasets in which we show that data analysis can be sped up significantly, which helps address the current WGBS analysis bottleneck for large-scale datasets without compromising accuracy.
    Availability and implementation: An implementation of FAME is open source and licensed under GPL-3.0 at https://github.com/FischerJo/FAME.
    MeSH term(s) Animals ; DNA Methylation ; Software ; Sequence Analysis, DNA/methods ; High-Throughput Nucleotide Sequencing/methods ; Sulfites ; DNA/genetics ; Mammals/genetics
    Chemical Substances hydrogen sulfite (OJ9787WBLU) ; Sulfites ; DNA (9007-49-2)
    Language English
    Publishing date 2023-11-21
    Publishing country England
    Document type Journal Article ; Research Support, N.I.H., Extramural ; Research Support, Non-U.S. Gov't
    ZDB-ID 1422668-6
    ISSN (online) 1367-4811
    ISSN 1367-4803
    DOI 10.1093/bioinformatics/btad386
    Database MEDical Literature Analysis and Retrieval System OnLINE

  2. Article ; Online: A statistical approach for identifying single nucleotide variants that affect transcription factor binding.

    Baumgarten, Nina / Rumpf, Laura / Kessler, Thorsten / Schulz, Marcel H

    iScience

    2024  Volume 27, Issue 5, Page(s) 109765

    Abstract Non-coding variants located within regulatory elements may alter gene expression by modifying transcription factor (TF) binding sites, thereby leading to functional consequences. Different TF models are being used to assess the effect of DNA sequence variants, such as single nucleotide variants (SNVs). Often existing methods are slow and do not assess statistical significance of results. We investigated the distribution of absolute maximal differential TF binding scores for general computational models that affect TF binding. We find that a modified Laplace distribution can adequately approximate the empirical distributions. A benchmark on
    Language English
    Publishing date 2024-04-18
    Publishing country United States
    Document type Journal Article
    ISSN (online) 2589-0042
    DOI 10.1016/j.isci.2024.109765
    Database MEDical Literature Analysis and Retrieval System OnLINE

  3. Article ; Online: Widespread effects of DNA methylation and intra-motif dependencies revealed by novel transcription factor binding models.

    Grau, Jan / Schmidt, Florian / Schulz, Marcel H

    Nucleic acids research

    2023  Volume 51, Issue 18, Page(s) e95

    Abstract Several studies suggested that transcription factor (TF) binding to DNA may be impaired or enhanced by DNA methylation. We present MeDeMo, a toolbox for TF motif analysis that combines information about DNA methylation with models capturing intra-motif dependencies. In a large-scale study using ChIP-seq data for 335 TFs, we identify novel TFs that show a binding behaviour associated with DNA methylation. Overall, we find that the presence of CpG methylation decreases the likelihood of binding for the majority of methylation-associated TFs. For a considerable subset of TFs, we show that intra-motif dependencies are pivotal for accurately modelling the impact of DNA methylation on TF binding. We illustrate that the novel methylation-aware TF binding models make it possible to predict differential ChIP-seq peaks and improve the genome-wide analysis of TF binding. Our work indicates that simplistic models that neglect the effect of DNA methylation on DNA binding may lead to systematic underperformance for methylation-associated TFs.
    Language English
    Publishing date 2023-08-31
    Publishing country England
    Document type Journal Article
    ZDB-ID 186809-3
    ISSN (online) 1362-4962 ; 1362-4954
    ISSN 0301-5610 ; 0305-1048
    DOI 10.1093/nar/gkad693
    Database MEDical Literature Analysis and Retrieval System OnLINE

  4. Article ; Online: Fast detection of differential chromatin domains with SCIDDO.

    Ebert, Peter / Schulz, Marcel H

    Bioinformatics (Oxford, England)

    2020  Volume 37, Issue 9, Page(s) 1198–1205

    Abstract Motivation: The generation of genome-wide maps of histone modifications using chromatin immunoprecipitation sequencing is a standard approach to dissect the complexity of the epigenome. Interpretation and differential analysis of histone datasets remain challenging due to regulatorily meaningful co-occurrences of histone marks and their differences in genomic spread. To ease interpretation, chromatin state segmentation maps are a commonly employed abstraction combining individual histone marks. We developed the tool SCIDDO as a fast, flexible and statistically sound method for the differential analysis of chromatin state segmentation maps.
    Results: We demonstrate the utility of SCIDDO in a comparative analysis that identifies differential chromatin domains (DCD) in various regulatory contexts and with only moderate computational resources. We show that the identified DCDs correlate well with observed changes in gene expression and can recover a substantial number of differentially expressed genes (DEGs). We showcase SCIDDO's ability to directly interrogate chromatin dynamics, such as enhancer switches in downstream analysis, which simplifies exploring specific questions about regulatory changes in chromatin. By comparing SCIDDO to competing methods, we provide evidence that SCIDDO's performance in identifying DEGs via differential chromatin marking is more stable across a range of cell-type comparisons and parameter cut-offs.
    Availability and implementation: The SCIDDO source code is openly available under github.com/ptrebert/sciddo.
    Supplementary information: Supplementary data are available at Bioinformatics online.
    MeSH term(s) Chromatin ; Chromatin Immunoprecipitation ; Chromosomes ; Genome ; Histone Code
    Chemical Substances Chromatin
    Language English
    Publishing date 2020-11-19
    Publishing country England
    Document type Journal Article ; Research Support, Non-U.S. Gov't
    ZDB-ID 1422668-6
    ISSN (online) 1367-4811
    ISSN 1367-4803
    DOI 10.1093/bioinformatics/btaa960
    Database MEDical Literature Analysis and Retrieval System OnLINE

  5. Article ; Online: Letting the data speak for themselves: a fully Bayesian approach to transcriptome assembly.

    Schulz, Marcel H

    Genome biology

    2015  Volume 15, Issue 10, Page(s) 498

    Abstract A novel method for transcriptome assembly, Bayesembler, provides greater accuracy without sacrifice of computational speed, and particular advantages for alternative transcripts expressed at low levels.
    MeSH term(s) Bayes Theorem ; Gene Expression Profiling/methods ; Humans ; Software ; Transcriptome
    Language English
    Publishing date 2015-03-31
    Publishing country England
    Document type Journal Article ; Comment
    ZDB-ID 2040529-7
    ISSN (online) 1474-760X ; 1465-6914
    ISSN 1465-6906
    DOI 10.1186/s13059-014-0498-8
    Database MEDical Literature Analysis and Retrieval System OnLINE

  6. Article: Multimodal analysis methods in predictive biomedicine.

    Qoku, Arber / Katsaouni, Nikoletta / Flinner, Nadine / Buettner, Florian / Schulz, Marcel H

    Computational and structural biotechnology journal

    2023  Volume 21, Page(s) 5829–5838

    Abstract For medicine to fulfill its promise of personalized treatments based on a better understanding of disease biology, computational and statistical tools must exist to analyze the increasing amount of patient data that becomes available. A particular challenge is that several types of data are being measured to cope with the complexity of the underlying systems, enhance predictive modeling and enrich molecular understanding. Here we review a number of recent approaches that specialize in the analysis of multimodal data in the context of predictive biomedicine. We focus on methods that combine different OMIC measurements with image or genome variation data. Our overview shows the diversity of methods that address analysis challenges and reveals new avenues for novel developments.
    Language English
    Publishing date 2023-11-20
    Publishing country Netherlands
    Document type Journal Article ; Review
    ZDB-ID 2694435-2
    ISSN 2001-0370
    DOI 10.1016/j.csbj.2023.11.011
    Database MEDical Literature Analysis and Retrieval System OnLINE

  7. Article ; Online: Improving in-silico normalization using read weights.

    Durai, Dilip A / Schulz, Marcel H

    Scientific reports

    2019  Volume 9, Issue 1, Page(s) 5133

    Abstract Specialized de novo assemblers for diverse datatypes have been developed and are in widespread use for the analyses of single-cell genomics, metagenomics and RNA-seq data. However, assembly of large sequencing datasets produced by modern technologies is challenging and computationally intensive. In-silico read normalization has been suggested as a computational strategy to reduce redundancy in read datasets, which leads to significant speedups and memory savings of assembly pipelines. Previously, we presented a set multi-cover optimization based approach, ORNA, where reads are reduced without losing important k-mer connectivity information, as used in assembly graphs. Here we propose extensions to ORNA, named ORNA-Q and ORNA-K, which consider a weighted set multi-cover optimization formulation for the in-silico read normalization problem. These novel formulations make use of the base quality scores obtained from sequencers (ORNA-Q) or k-mer abundances of reads (ORNA-K) to improve normalization further. We devise efficient heuristic algorithms for solving both formulations. In applications to human RNA-seq data, ORNA-Q and ORNA-K are shown to assemble more or equally many full length transcripts compared to other normalization methods at similar or higher read reduction values. The algorithm is implemented under the latest version of ORNA (v2.0, https://github.com/SchulzLab/ORNA ).
    Language English
    Publishing date 2019-03-26
    Publishing country England
    Document type Journal Article ; Research Support, Non-U.S. Gov't
    ZDB-ID 2615211-3
    ISSN (online) 2045-2322
    ISSN 2045-2322
    DOI 10.1038/s41598-019-41502-9
    Database MEDical Literature Analysis and Retrieval System OnLINE

  8. Article ; Online: The adapted Activity-By-Contact model for enhancer-gene assignment and its application to single-cell data.

    Hecker, Dennis / Behjati Ardakani, Fatemeh / Karollus, Alexander / Gagneur, Julien / Schulz, Marcel H

    Bioinformatics (Oxford, England)

    2023  Volume 39, Issue 2

    Abstract Motivation: Identifying regulatory regions in the genome is of great interest for understanding the epigenomic landscape in cells. One fundamental challenge in this context is to find the target genes whose expression is affected by the regulatory regions. A recent successful method is the Activity-By-Contact (ABC) model, which scores enhancer-gene interactions based on enhancer activity and the contact frequency of an enhancer to its target gene. However, it describes regulatory interactions entirely from a gene's perspective, and does not account for all the candidate target genes of an enhancer. In addition, the ABC model requires two types of assays to measure enhancer activity, which limits its applicability. Moreover, no available implementation allows integration with transcription factor (TF) binding information or efficient analysis of single-cell data.
    Results: We demonstrate that the ABC score can yield a higher accuracy by adapting the enhancer activity according to the number of contacts the enhancer has to its candidate target genes and also by considering all annotated transcription start sites of a gene. Further, we show that the model is comparably accurate with only one assay to measure enhancer activity. We combined our generalized ABC model with TF binding information and illustrated an analysis of a single-cell ATAC-seq dataset of the human heart, where we were able to characterize cell type-specific regulatory interactions and predict gene expression based on TF affinities. All executed processing steps are incorporated into our new computational pipeline STARE.
    Availability and implementation: The software is available at https://github.com/schulzlab/STARE.
    Contact: marcel.schulz@em.uni-frankfurt.de.
    Supplementary information: Supplementary data are available at Bioinformatics online.
    MeSH term(s) Humans ; Gene Expression Regulation ; Transcription Factors/metabolism ; Regulatory Sequences, Nucleic Acid ; Software ; Protein Binding
    Chemical Substances Transcription Factors
    Language English
    Publishing date 2023-01-27
    Publishing country England
    Document type Journal Article ; Research Support, Non-U.S. Gov't
    ZDB-ID 1422668-6
    ISSN (online) 1367-4811
    ISSN 1367-4803
    DOI 10.1093/bioinformatics/btad062
    Database MEDical Literature Analysis and Retrieval System OnLINE

  9. Article ; Online: On the problem of confounders in modeling gene expression

    Schmidt, Florian / Schulz, Marcel H.

    Bioinformatics

    2019  Volume 35, Issue 4, Page(s) 711–719

    Abstract Modeling of Transcription Factor (TF) binding from both ChIP-seq and chromatin accessibility data has become prevalent in computational biology. Several models have been proposed to generate new hypotheses on transcriptional regulation. However, there is no distinct approach to derive TF binding scores from ChIP-seq and open chromatin experiments. Here, we review biases of various scoring approaches and their effects on the interpretation and reliability of predictive gene expression models. We generated predictive models for gene expression using ChIP-seq and DNase1-seq data from DEEP and ENCODE. Via randomization experiments, we identified confounders in TF gene scores derived from both ChIP-seq and DNase1-seq data. We reviewed correction approaches for both data types, which reduced the influence of identified confounders without harm to model performance. Also, our analyses highlighted further quality control measures, in addition to model performance, that may help to assure model reliability and to avoid misinterpretation in future studies. The software used in this study is available online at https://github.com/SchulzLab/TEPIC. Supplementary data are available at Bioinformatics online.
    Keywords bioinformatics ; chromatin ; chromatin immunoprecipitation ; computer software ; gene expression ; genes ; model validation ; quality control ; transcription (genetics) ; transcription factors
    Language English
    Dates of publication 2019-02-15
    Size p. 711-719
    Publishing place Oxford University Press
    Document type Article ; Online
    Note Use and reproduction
    ZDB-ID 1468345-3
    ISSN 1367-4811 ; 1460-2059
    DOI 10.1093/bioinformatics/bty674
    Database NAL-Catalogue (AGRICOLA)

  10. Article ; Online: In silico read normalization using set multi-cover optimization.

    Durai, Dilip A / Schulz, Marcel H

    Bioinformatics (Oxford, England)

    2018  Volume 34, Issue 19, Page(s) 3273–3280

    Abstract Motivation: De Bruijn graphs are a common assembly data structure for sequencing datasets. But with the advances in sequencing technologies, assembling high coverage datasets has become a computational challenge. Read normalization, which removes redundancy in datasets, is widely applied to reduce resource requirements. Current normalization algorithms, though efficient, provide no guarantee to preserve important k-mers that form connections between regions in the graph.
    Results: Here, normalization is phrased as a set multi-cover problem on reads and a heuristic algorithm, Optimized Read Normalization Algorithm (ORNA), is proposed. ORNA normalizes to the minimum number of reads required to retain all k-mers and their relative k-mer abundances from the original dataset. Hence, all connections from the original graph are preserved. ORNA was tested on various RNA-seq datasets with different coverage values. It was compared to the current normalization algorithms and was found to perform better. Normalizing error-corrected data allows for more accurate assemblies compared to the normalized uncorrected dataset. Further, an application is proposed in which multiple datasets are combined and normalized to predict novel transcripts that would have been missed otherwise. Finally, ORNA is a general-purpose normalization algorithm that is fast and significantly reduces datasets, with a loss of assembly quality between 1% and 30% depending on reduction stringency.
    Availability and implementation: ORNA is available at https://github.com/SchulzLab/ORNA.
    Supplementary information: Supplementary data are available at Bioinformatics online.
    MeSH term(s) Algorithms ; Computational Biology ; Computer Simulation ; Sequence Analysis, RNA
    Language English
    Publishing date 2018-06-03
    Publishing country England
    Document type Journal Article ; Research Support, Non-U.S. Gov't
    ZDB-ID 1422668-6
    ISSN (online) 1367-4811
    ISSN 1367-4803
    DOI 10.1093/bioinformatics/bty307
    Database MEDical Literature Analysis and Retrieval System OnLINE
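
    Illustration for result 10: the abstract phrases read normalization as a set multi-cover problem in which reads are dropped only if every k-mer stays covered. The Python sketch below is a simplified greedy heuristic in that spirit, not the ORNA implementation. The function names and the `target` parameter are hypothetical, and the fixed per-k-mer coverage cap is an assumption made for illustration. It retains every k-mer of the input but does not attempt to preserve relative k-mer abundances.

```python
from collections import Counter


def kmers(read, k):
    """Return all k-mers of a read, in order, with repeats."""
    return [read[i:i + k] for i in range(len(read) - k + 1)]


def normalize_reads(reads, k, target=2):
    """Greedy sketch: keep a read only if it still helps cover some k-mer.

    Every k-mer must end up covered min(original abundance, target) times.
    `target` is a hypothetical knob for this sketch, not an ORNA parameter.
    """
    abundance = Counter(km for read in reads for km in kmers(read, k))
    need = {km: min(count, target) for km, count in abundance.items()}

    covered = Counter()  # occurrences of each k-mer among the kept reads
    kept = []
    for read in reads:
        read_kmers = kmers(read, k)
        # Keep the read if at least one of its k-mers is still under-covered.
        if any(covered[km] < need[km] for km in read_kmers):
            kept.append(read)
            covered.update(read_kmers)
    return kept


reads = ["ACGTAC", "ACGTAC", "ACGTAC", "CGTACG", "TTTTGG"]
reduced = normalize_reads(reads, k=3, target=1)
print(reduced)
```

    A read is discarded only when all of its k-mers have already reached their required coverage, so no k-mer from the input is lost. The order in which reads are visited affects which reads are kept.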
