LIVIVO - The Search Portal for Life Sciences

Search results

Results 1 - 10 of 48 total

  1. Article: hipFG: High-throughput harmonization and integration pipeline for functional genomics data.

    Cifello, Jeffrey / Kuksa, Pavel P / Saravanan, Naveensri / Valladares, Otto / Leung, Yuk Yee / Wang, Li-San

    bioRxiv : the preprint server for biology

    2023  

    Abstract Preparing functional genomic (FG) data with diverse assay types and file formats for integration into analysis workflows that interpret genome-wide association and other studies is a significant and time-consuming challenge. Here we introduce hipFG, an automatically customized pipeline for efficient and scalable normalization of heterogenous FG data collections into standardized, indexed, rapidly searchable analysis-ready datasets while accounting for FG datatypes (e.g., chromatin interactions, genomic intervals, quantitative trait loci).
    Language English
    Publishing date 2023-04-25
    Publishing country United States
    Document type Preprint
    DOI 10.1101/2023.04.21.537695
    Database MEDical Literature Analysis and Retrieval System OnLINE

  2. Article ; Online: hipFG: high-throughput harmonization and integration pipeline for functional genomics data.

    Cifello, Jeffrey / Kuksa, Pavel P / Saravanan, Naveensri / Valladares, Otto / Wang, Li-San / Leung, Yuk Yee

    Bioinformatics (Oxford, England)

    2023  Volume 39, Issue 11

    Abstract Summary: Preparing functional genomic (FG) data with diverse assay types and file formats for integration into analysis workflows that interpret genome-wide association and other studies is a significant and time-consuming challenge. Here we introduce hipFG (Harmonization and Integration Pipeline for Functional Genomics), an automatically customized pipeline for efficient and scalable normalization of heterogenous FG data collections into standardized, indexed, rapidly searchable analysis-ready datasets while accounting for FG datatypes (e.g. chromatin interactions, genomic intervals, quantitative trait loci).
    Availability and implementation: hipFG is freely available at https://bitbucket.org/wanglab-upenn/hipFG. A Docker container is available at https://hub.docker.com/r/wanglab/hipfg.
    MeSH term(s) Software ; Genome-Wide Association Study ; Genomics ; Chromatin ; Quantitative Trait Loci
    Chemical Substances Chromatin
    Language English
    Publishing date 2023-12-06
    Publishing country England
    Document type Journal Article
    ZDB-ID 1422668-6
    ISSN (online) 1367-4811
    ISSN 1367-4803
    DOI 10.1093/bioinformatics/btad673
    Database MEDical Literature Analysis and Retrieval System OnLINE

  3. Article ; Online: A comparative study of structural variant calling in WGS from Alzheimer's disease families.

    Malamon, John S / Farrell, John J / Xia, Li Charlie / Dombroski, Beth A / Das, Rueben G / Way, Jessica / Kuzma, Amanda B / Valladares, Otto / Leung, Yuk Yee / Scanlon, Allison J / Lopez, Irving Antonio Barrera / Brehony, Jack / Worley, Kim C / Zhang, Nancy R / Wang, Li-San / Farrer, Lindsay A / Schellenberg, Gerard D / Lee, Wan-Ping / Vardarajan, Badri N

    Life science alliance

    2024  Volume 7, Issue 5

    Abstract Detecting structural variants (SVs) in whole-genome sequencing poses significant challenges. We present a protocol for variant calling, merging, genotyping, sensitivity analysis, and laboratory validation for generating a high-quality SV call set in whole-genome sequencing from the Alzheimer's Disease Sequencing Project comprising 578 individuals from 111 families. Employing two complementary pipelines, Scalpel and Parliament, for SV/indel calling, we assessed sensitivity through sample replicates (N = 9) with in silico variant spike-ins. We developed a novel metric, D-score, to evaluate caller specificity for deletions. The accuracy of deletions was evaluated by Sanger sequencing. We generated a high-quality call set of 152,301 deletions of diverse sizes. Sanger sequencing validated 114 of 146 detected deletions (78.1%). Scalpel excelled in accuracy for deletions ≤100 bp, whereas Parliament was optimal for deletions >900 bp. Overall, 83.0% and 72.5% of calls by Scalpel and Parliament were validated, respectively, including all 11 deletions called by both Parliament and Scalpel between 101 and 900 bp. Our flexible protocol successfully generated a high-quality deletion call set and a truth set of Sanger sequencing-validated deletions with precise breakpoints spanning 1-17,000 bp.
    MeSH term(s) Humans ; Alzheimer Disease/genetics ; Whole Genome Sequencing/methods
    Language English
    Publishing date 2024-02-28
    Publishing country United States
    Document type Journal Article
    ISSN (online) 2575-1077
    DOI 10.26508/lsa.202302181
    Database MEDical Literature Analysis and Retrieval System OnLINE

  4. Article ; Online: NIAGADS Alzheimer's GenomicsDB: A resource for exploring Alzheimer's disease genetic and genomic knowledge.

    Greenfest-Allen, Emily / Valladares, Otto / Kuksa, Pavel P / Gangadharan, Prabhakaran / Lee, Wan-Ping / Cifello, Jeffrey / Katanic, Zivadin / Kuzma, Amanda B / Wheeler, Nicholas / Bush, William S / Leung, Yuk Yee / Schellenberg, Gerard / Stoeckert, Christian J / Wang, Li-San

    Alzheimer's & dementia : the journal of the Alzheimer's Association

    2023  Volume 20, Issue 2, Page(s) 1123–1136

    Abstract Introduction: The National Institute on Aging Genetics of Alzheimer's Disease Data Storage Site Alzheimer's Genomics Database (GenomicsDB) is a public knowledge base of Alzheimer's disease (AD) genetic datasets and genomic annotations.
    Methods: GenomicsDB uses a custom systems architecture to adopt and enforce rigorous standards that facilitate harmonization of AD-relevant genome-wide association study summary statistics datasets with functional annotations, including over 230 million annotated variants from the AD Sequencing Project.
    Results: GenomicsDB generates interactive reports compiled from the harmonized datasets and annotations. These reports contextualize AD-risk associations in a broader functional genomic setting and summarize them in the context of functionally annotated genes and variants.
    Discussion: Created to make AD-genetics knowledge more accessible to AD researchers, the GenomicsDB is designed to guide users unfamiliar with genetic data in not only exploring but also interpreting this ever-growing volume of data. Scalable and interoperable with other genomics resources using data technology standards, the GenomicsDB can serve as a central hub for research and data analysis on AD and related dementias.
    Highlights: The National Institute on Aging Genetics of Alzheimer's Disease Data Storage Site (NIAGADS) offers to the public a unique, disease-centric collection of AD-relevant GWAS summary statistics datasets. Interpreting these data is challenging and requires significant bioinformatics expertise to standardize datasets and harmonize them with functional annotations on genome-wide scales. The NIAGADS Alzheimer's GenomicsDB helps overcome these challenges by providing a user-friendly public knowledge base for AD-relevant genetics that shares harmonized, annotated summary statistics datasets from the NIAGADS repository in an interpretable, easily searchable format.
    MeSH term(s) United States ; Humans ; Alzheimer Disease/genetics ; Genome-Wide Association Study ; National Institute on Aging (U.S.) ; Genomics ; Databases, Factual ; Genetic Predisposition to Disease/genetics
    Language English
    Publishing date 2023-10-26
    Publishing country United States
    Document type Journal Article
    ZDB-ID 2211627-8
    ISSN (online) 1552-5279
    ISSN 1552-5260
    DOI 10.1002/alz.13509
    Database MEDical Literature Analysis and Retrieval System OnLINE

  5. Article ; Online: FILER: a framework for harmonizing and querying large-scale functional genomics knowledge.

    Kuksa, Pavel P / Leung, Yuk Yee / Gangadharan, Prabhakaran / Katanic, Zivadin / Kleidermacher, Lauren / Amlie-Wolf, Alexandre / Lee, Chien-Yueh / Qu, Liming / Greenfest-Allen, Emily / Valladares, Otto / Wang, Li-San

    NAR genomics and bioinformatics

    2022  Volume 4, Issue 1, Page(s) lqab123

    Abstract Querying massive functional genomic and annotation data collections, linking and summarizing the query results across data sources/data types are important steps in high-throughput genomic and genetic analytical workflows. However, these steps are made difficult by the heterogeneity and breadth of data sources, experimental assays, biological conditions/tissues/cell types and file formats. FILER (FunctIonaL gEnomics Repository) is a framework for querying large-scale genomics knowledge with a large, curated integrated catalog of harmonized functional genomic and annotation data coupled with a scalable genomic search and querying interface. FILER uniquely provides: (i) streamlined access to >50 000 harmonized, annotated genomic datasets across >20 integrated data sources, >1100 tissues/cell types and >20 experimental assays; (ii) a scalable genomic querying interface; and (iii) ability to analyze and annotate user's experimental data. This rich resource spans >17 billion GRCh37/hg19 and GRCh38/hg38 genomic records. Our benchmark querying 7 × 10
    Language English
    Publishing date 2022-01-14
    Publishing country England
    Document type Journal Article
    ISSN (online) 2631-9268
    DOI 10.1093/nargab/lqab123
    Database MEDical Literature Analysis and Retrieval System OnLINE

  6. Article ; Online: HIPPIE2: a method for fine-scale identification of physically interacting chromatin regions.

    Kuksa, Pavel P / Amlie-Wolf, Alexandre / Hwang, Yih-Chii / Valladares, Otto / Gregory, Brian D / Wang, Li-San

    NAR genomics and bioinformatics

    2020  Volume 2, Issue 2, Page(s) lqaa022

    Abstract Most regulatory chromatin interactions are mediated by various transcription factors (TFs) and involve physically interacting elements such as enhancers, insulators or promoters. To map these elements and interactions at a fine scale, we developed HIPPIE2 that analyzes raw reads from high-throughput chromosome conformation (Hi-C) experiments to identify precise loci of DNA physically interacting regions (PIRs). Unlike standard genome binning approaches (e.g. 10-kb to 1-Mb bins), HIPPIE2 dynamically infers the physical locations of PIRs using the distribution of restriction sites to increase analysis precision and resolution. We applied HIPPIE2 to
    Language English
    Publishing date 2020-03-31
    Publishing country England
    Document type Journal Article
    ISSN (online) 2631-9268
    DOI 10.1093/nargab/lqaa022
    Database MEDical Literature Analysis and Retrieval System OnLINE

  7. Article ; Online: DASHR 2.0: integrated database of human small non-coding RNA genes and mature products

    Kuksa, Pavel P. / Amlie-Wolf, Alexandre / Katanić, Živadin / Valladares, Otto / Wang, Lisan / Leung, Yuk Yee

    Bioinformatics

    2019  Volume 35, Issue 6, Page(s) 1033–1039

    Abstract Small non-coding RNAs (sncRNAs, <100 nts) are highly abundant RNAs that regulate diverse and often tissue-specific cellular processes by associating with transcription factor complexes or binding to mRNAs. While thousands of sncRNA genes exist in the human genome, no single resource provides searchable, unified annotation, expression and processing information for full sncRNA transcripts and mature RNA products derived from these larger RNAs. Our goal is to establish a complete catalog of annotation, expression, processing, conservation, tissue-specificity and other biological features for all human sncRNA genes and mature products derived from all major RNA classes. DASHR (Database of small human non-coding RNAs) v2.0 database is the first that integrates human sncRNA gene and mature products profiles obtained from multiple RNA-seq protocols. Altogether, 185 tissues/cell types and sncRNA annotations and >800 curated experiments from ENCODE and GEO/SRA across multiple RNA-seq protocols for both GRCh38/hg38 and GRCh37/hg19 assemblies are integrated in DASHR. Moreover, DASHR is the first to contain both known and novel, previously un-annotated sncRNA loci identified by unsupervised segmentation (13 times more loci with 1 678 800 total). Additionally, DASHR v2.0 adds >3 200 000 annotations for non-small RNA genes and other genomic features (long-noncoding RNAs, mRNAs, promoters, repeats). Furthermore, DASHR v2.0 introduces an enhanced user interface, interactive experiment-by-locus table view, sncRNA locus sorting and filtering by biological features. All annotation and expression information is directly downloadable and accessible as UCSC genome browser tracks. DASHR v2.0 is freely available at https://lisanwanglab.org/DASHRv2. Supplementary data are available at Bioinformatics online.
    Keywords bioinformatics ; databases ; genes ; genomics ; humans ; loci ; non-coding RNA ; sequence analysis ; transcription factors ; user interface
    Language English
    Dates of publication 2019-03-15
    Size p. 1033-1039
    Publishing place Oxford University Press
    Document type Article ; Online
    ZDB-ID 1468345-3
    ISSN 1367-4811 ; 1460-2059
    DOI 10.1093/bioinformatics/bty709
    Database NAL-Catalogue (AGRICOLA)

  8. Article ; Online: Genetically regulated expression in late-onset Alzheimer's disease implicates risk genes within known and novel loci.

    Chen, Hung-Hsin / Petty, Lauren E / Sha, Jin / Zhao, Yi / Kuzma, Amanda / Valladares, Otto / Bush, William / Naj, Adam C / Gamazon, Eric R / Below, Jennifer E

    Translational psychiatry

    2021  Volume 11, Issue 1, Page(s) 618

    Abstract Late-onset Alzheimer disease (LOAD) is highly polygenic, with a heritability estimated between 40 and 80%, yet risk variants identified in genome-wide studies explain only ~8% of phenotypic variance. Due to its increased power and interpretability, genetically regulated expression (GReX) analysis is an emerging approach to investigate the genetic mechanisms of complex diseases. Here, we conducted GReX analysis within and across 51 tissues on 39 LOAD GWAS data sets comprising 58,713 cases and controls from the Alzheimer's Disease Genetics Consortium (ADGC) and the International Genomics of Alzheimer's Project (IGAP). Meta-analysis across studies identified 216 unique significant genes, including 72 with no previously reported LOAD GWAS associations. Cross-brain-tissue and cross-GTEx models revealed eight additional genes significantly associated with LOAD. Conditional analysis of previously reported loci using established LOAD-risk variants identified eight genes reaching genome-wide significance independent of known signals. Moreover, the proportion of SNP-based heritability is highly enriched in genes identified by GReX analysis. In summary, GReX-based meta-analysis in LOAD identifies 216 genes (including 72 novel genes), illuminating the role of gene regulatory models in LOAD.
    MeSH term(s) Alzheimer Disease/genetics ; Genetic Predisposition to Disease ; Genome-Wide Association Study ; Humans ; Multifactorial Inheritance ; Polymorphism, Single Nucleotide
    Language English
    Publishing date 2021-12-06
    Publishing country United States
    Document type Journal Article ; Meta-Analysis ; Research Support, N.I.H., Extramural
    ZDB-ID 2609311-X
    ISSN (online) 2158-3188
    DOI 10.1038/s41398-021-01677-0
    Database MEDical Literature Analysis and Retrieval System OnLINE

  9. Article: Copy Number Variation Identification on 3,800 Alzheimer's Disease Whole Genome Sequencing Data from the Alzheimer's Disease Sequencing Project.

    Lee, Wan-Ping / Tucci, Albert A / Conery, Mitchell / Leung, Yuk Yee / Kuzma, Amanda B / Valladares, Otto / Chou, Yi-Fan / Lu, Wenbin / Wang, Li-San / Schellenberg, Gerard D / Tzeng, Jung-Ying

    Frontiers in genetics

    2021  Volume 12, Page(s) 752390

    Abstract Alzheimer's Disease (AD) is a progressive neurologic disease and the most common form of dementia. While the causes of AD are not completely understood, genetics plays a key role in the etiology of AD, and thus finding genetic factors holds the potential to uncover novel AD mechanisms. For this study, we focus on copy number variation (CNV) detection and burden analysis. Leveraging whole-genome sequence (WGS) data released by Alzheimer's Disease Sequencing Project (ADSP), we developed a scalable bioinformatics pipeline to identify CNVs. This pipeline was applied to 1,737 AD cases and 2,063 cognitively normal controls. As a result, we observed 237,306 and 42,767 deletions and duplications, respectively, with an average of 2,255 deletions and 1,820 duplications per subject. The burden tests show that Non-Hispanic-White cases on average have 16 more duplications than controls do (
    Language English
    Publishing date 2021-11-04
    Publishing country Switzerland
    Document type Journal Article
    ZDB-ID 2606823-0
    ISSN 1664-8021
    DOI 10.3389/fgene.2021.752390
    Database MEDical Literature Analysis and Retrieval System OnLINE

  10. Article ; Online: Alzheimer's Disease Variant Portal: A Catalog of Genetic Findings for Alzheimer's Disease.

    Kuksa, Pavel P / Liu, Chia-Lun / Fu, Wei / Qu, Liming / Zhao, Yi / Katanic, Zivadin / Clark, Kaylyn / Kuzma, Amanda B / Ho, Pei-Chuan / Tzeng, Kai-Teh / Valladares, Otto / Chou, Shin-Yi / Naj, Adam C / Schellenberg, Gerard D / Wang, Li-San / Leung, Yuk Yee

    Journal of Alzheimer's disease : JAD

    2022  Volume 86, Issue 1, Page(s) 461–477

    Abstract Background: Recent Alzheimer's disease (AD) genetics findings from genome-wide association studies (GWAS) span progressively larger and more diverse populations and outcomes. Currently, there is no up-to-date resource providing harmonized and searchable information on all AD genetic associations found by GWAS, nor linking the reported genetic variants and genes with functional and genomic annotations.
    Objective: Create an integrated/harmonized, and literature-derived collection of population-specific AD genetic associations.
    Methods: We developed the Alzheimer's Disease Variant Portal (ADVP), an extensive collection of associations curated from >200 GWAS publications from Alzheimer's Disease Genetics Consortium and other consortia. Genetic associations were systematically extracted, harmonized, and annotated from both the genome-wide significant and suggestive loci reported in these publications. To ensure consistent representation of AD genetic findings, all the extracted genetic association information was harmonized across specifically designed publication, variant, and association categories.
    Results: ADVP V1.0 (February 2021) catalogs 6,990 associations related to disease-risk, expression quantitative traits, endophenotypes, or neuropathology. This extensive harmonization effort led to a catalog containing >900 loci, >1,800 variants, >80 cohorts, and 8 populations. Besides, ADVP provides investigators with a seamless integration of genomic and publicly available functional annotations across multiple databases per harmonized variant and gene records, thus facilitating further understanding and analyses of these genetics findings.
    Conclusion: ADVP is a valuable resource for investigators to quickly and systematically explore high-confidence AD genetic findings and provides insights into population-specific AD genetic architecture. ADVP is continually maintained and enhanced by NIAGADS and is freely accessible at https://advp.niagads.org.
    MeSH term(s) Alzheimer Disease/genetics ; Endophenotypes ; Genetic Predisposition to Disease/genetics ; Genome-Wide Association Study ; Humans ; Polymorphism, Single Nucleotide
    Language English
    Publishing date 2022-01-07
    Publishing country Netherlands
    Document type Journal Article ; Research Support, N.I.H., Extramural ; Research Support, Non-U.S. Gov't
    ZDB-ID 1440127-7
    ISSN (online) 1875-8908
    ISSN 1387-2877
    DOI 10.3233/JAD-215055
    Database MEDical Literature Analysis and Retrieval System OnLINE
