LIVIVO - The Search Portal for Life Sciences

Search results

Result 1 - 10 of total 197

  1. Article ; Online: HapCUT2: A Method for Phasing Genomes Using Experimental Sequence Data.

    Bansal, Vikas

    Methods in molecular biology (Clifton, N.J.)

    2022  Volume 2590, Page(s) 139–147

    Abstract Rapid advances in high-throughput DNA sequencing technologies have enabled variant discovery from whole-genome sequencing (WGS) datasets; however, linking variants on a chromosome together into haplotypes, also known as haplotype phasing, remains difficult. Human genomes are diploid and haplotype phasing is crucial for the complete interpretation and analysis of genetic variation. HapCUT2 ( https://github.com/vibansal/HapCUT2 ) is open-source software for phasing diploid genomes using sequence data generated using different sequencing technologies and experimental methods. In this article, we give an overview of the algorithm used by HapCUT2 and describe how to use HapCUT2 for haplotype phasing of individual genomes using different types of sequence data.
    MeSH term(s) Humans ; Sequence Analysis, DNA/methods ; Polymorphism, Single Nucleotide ; Haplotypes ; Genome, Human ; High-Throughput Nucleotide Sequencing/methods ; Algorithms
    Language English
    Publishing date 2022-11-05
    Publishing country United States
    Document type Journal Article
    ISSN 1940-6029
    ISSN (online) 1940-6029
    DOI 10.1007/978-1-0716-2819-5_9
    Database MEDical Literature Analysis and Retrieval System OnLINE

  2. Article ; Online: A multilocus approach for accurate variant calling in low-copy repeats using whole-genome sequencing.

    Prodanov, Timofey / Bansal, Vikas

    Bioinformatics (Oxford, England)

    2023  Volume 39, Issue Suppl 1, Page(s) i279–i287

    Abstract Motivation: Low-copy repeats (LCRs) or segmental duplications are long segments of duplicated DNA that cover > 5% of the human genome. Existing tools for variant calling using short reads exhibit low accuracy in LCRs due to ambiguity in read mapping and extensive copy number variation. Variants in more than 150 genes overlapping LCRs are associated with risk for human diseases.
    Methods: We describe a short-read variant calling method, ParascopyVC, that performs variant calling jointly across all repeat copies and utilizes reads independent of mapping quality in LCRs. To identify candidate variants, ParascopyVC aggregates reads mapped to different repeat copies and performs polyploid variant calling. Subsequently, paralogous sequence variants that can differentiate repeat copies are identified using population data and used for estimating the genotype of variants for each repeat copy.
    Results: On simulated whole-genome sequence data, ParascopyVC achieved higher precision (0.997) and recall (0.807) than three state-of-the-art variant callers (best precision = 0.956 for DeepVariant and best recall = 0.738 for GATK) in 167 LCR regions. Benchmarking of ParascopyVC using the Genome in a Bottle high-confidence variant calls for the HG002 genome showed that it achieved a very high precision of 0.991 and a high recall of 0.909 across LCR regions, significantly better than FreeBayes (precision = 0.954 and recall = 0.822), GATK (precision = 0.888 and recall = 0.873) and DeepVariant (precision = 0.983 and recall = 0.861). ParascopyVC demonstrated a consistently higher accuracy (mean F1 = 0.947) than other callers (best F1 = 0.908) across seven human genomes.
    Availability and implementation: ParascopyVC is implemented in Python and is freely available at https://github.com/tprodanov/ParascopyVC.
    MeSH term(s) Humans ; Segmental Duplications, Genomic ; DNA Copy Number Variations ; Whole Genome Sequencing ; Benchmarking ; Genome, Human
    Language English
    Publishing date 2023-06-28
    Publishing country England
    Document type Journal Article ; Research Support, N.I.H., Extramural
    ZDB-ID 1422668-6
    ISSN (online) 1367-4811
    ISSN 1367-4803
    DOI 10.1093/bioinformatics/btad268
    Database MEDical Literature Analysis and Retrieval System OnLINE

  3. Article ; Online: Integrating read-based and population-based phasing for dense and accurate haplotyping of individual genomes.

    Bansal, Vikas

    Bioinformatics (Oxford, England)

    2019  Volume 35, Issue 14, Page(s) i242–i248

    Abstract Motivation: Reconstruction of haplotypes for human genomes is an important problem in medical and population genetics. Hi-C sequencing generates read pairs with long-range haplotype information that can be computationally assembled to generate chromosome-spanning haplotypes. However, the haplotypes have limited completeness and low accuracy. Haplotype information from population reference panels can potentially be used to improve the completeness and accuracy of Hi-C haplotyping.
    Results: In this paper, we describe a likelihood based method to integrate short-range haplotype information from a population reference panel of haplotypes with the long-range haplotype information present in sequence reads from methods such as Hi-C to assemble dense and highly accurate haplotypes for individual genomes. Our method leverages a statistical phasing method and a maximum spanning tree algorithm to determine the optimal second-order approximation of the population-based haplotype likelihood for an individual genome. The population-based likelihood is encoded using pseudo-reads which are then used as input along with sequence reads for haplotype assembly using an existing tool, HapCUT2. Using whole-genome Hi-C data for two human genomes (NA19240 and NA12878), we demonstrate that this integrated phasing method enables the phasing of 97-98% of variants, reduces the switch error rates by 3-6-fold, and outperforms an existing method for combining phase information from sequence reads with population-based phasing. On Strand-seq data for NA12878, our method improves the haplotype completeness from 71.4 to 94.6% and reduces the switch error rate 2-fold, demonstrating its utility for phasing using multiple sequencing technologies.
    Availability and implementation: Code and datasets are available at https://github.com/vibansal/IntegratedPhasing.
    MeSH term(s) Algorithms ; Genome, Human ; Haplotypes ; High-Throughput Nucleotide Sequencing ; Humans ; Likelihood Functions ; Polymorphism, Single Nucleotide ; Sequence Analysis, DNA
    Language English
    Publishing date 2019-08-26
    Publishing country England
    Document type Journal Article
    ZDB-ID 1422668-6
    ISSN (online) 1367-4811
    ISSN 1367-4803
    DOI 10.1093/bioinformatics/btz329
    Database MEDical Literature Analysis and Retrieval System OnLINE

  4. Article ; Online: Robust and accurate estimation of paralog-specific copy number for duplicated genes using whole-genome sequencing.

    Prodanov, Timofey / Bansal, Vikas

    Nature communications

    2022  Volume 13, Issue 1, Page(s) 3221

    Abstract The human genome contains hundreds of low-copy repeats (LCRs) that are challenging to analyze using short-read sequencing technologies due to extensive copy number variation and ambiguity in read mapping. Copy number and sequence variants in more than 150 duplicated genes that overlap LCRs have been implicated in monogenic and complex human diseases. We describe a computational tool, Parascopy, for estimating the aggregate and paralog-specific copy number of duplicated genes using whole-genome sequencing (WGS). Parascopy is an efficient method that jointly analyzes reads mapped to different repeat copies without the need for global realignment. It leverages multiple samples to mitigate sequencing bias and to identify reliable paralogous sequence variants (PSVs) that differentiate repeat copies. Analysis of WGS data for 2504 individuals from diverse populations showed that Parascopy is robust to sequencing bias, has higher accuracy compared to existing methods and enables prioritization of pathogenic copy number changes in duplicated genes.
    MeSH term(s) DNA Copy Number Variations/genetics ; Genome, Human/genetics ; High-Throughput Nucleotide Sequencing/methods ; Humans ; Segmental Duplications, Genomic ; Sequence Analysis, DNA/methods ; Whole Genome Sequencing/methods
    Language English
    Publishing date 2022-06-09
    Publishing country England
    Document type Journal Article ; Research Support, N.I.H., Extramural
    ZDB-ID 2553671-0
    ISSN (online) 2041-1723
    ISSN 2041-1723
    DOI 10.1038/s41467-022-30930-3
    Database MEDical Literature Analysis and Retrieval System OnLINE

  5. Article: Whole-exome sequencing in familial type 2 diabetes identifies an atypical missense variant in the RyR2 gene.

    Bansal, Vikas / Winkelmann, Bernhard R / Dietrich, Johannes W / Boehm, Bernhard O

    Frontiers in endocrinology

    2024  Volume 15, Page(s) 1258982

    Abstract Genome-wide association studies have identified several hundred loci associated with type 2 diabetes mellitus (T2DM). Additionally, pathogenic variants in several genes are known to cause monogenic diabetes that overlaps clinically with T2DM. Whole-exome sequencing of related individuals with T2DM is a powerful approach to identify novel high-penetrance disease variants in coding regions of the genome. We performed whole-exome sequencing on four related individuals with T2DM - including one individual diagnosed at the age of 33 years. The individuals were negative for mutations in monogenic diabetes genes, had a strong family history of T2DM, and presented with several characteristics of metabolic syndrome. A missense variant (p.N2291D) in the type 2 ryanodine receptor (
    MeSH term(s) Adult ; Animals ; Humans ; Mice ; Diabetes Mellitus, Type 2/complications ; Diabetes Mellitus, Type 2/genetics ; Exome Sequencing ; Genome-Wide Association Study ; Glucose ; Glucose Intolerance ; Mutation, Missense ; Ryanodine Receptor Calcium Release Channel/genetics
    Chemical Substances Glucose (IY9XDZ35W2) ; Ryanodine Receptor Calcium Release Channel ; RyR2 protein, human ; ryanodine receptor 2. mouse
    Language English
    Publishing date 2024-02-20
    Publishing country Switzerland
    Document type Journal Article ; Research Support, Non-U.S. Gov't
    ZDB-ID 2592084-4
    ISSN 1664-2392
    DOI 10.3389/fendo.2024.1258982
    Database MEDical Literature Analysis and Retrieval System OnLINE

  6. Article ; Online: A computational method for estimating the PCR duplication rate in DNA and RNA-seq experiments.

    Bansal, Vikas

    BMC bioinformatics

    2017  Volume 18, Issue Suppl 3, Page(s) 43

    Abstract Background: PCR amplification is an important step in the preparation of DNA sequencing libraries prior to high-throughput sequencing. PCR amplification introduces redundant reads in the sequence data and estimating the PCR duplication rate is important to assess the frequency of such reads. Existing computational methods do not distinguish PCR duplicates from "natural" read duplicates that represent independent DNA fragments and therefore, over-estimate the PCR duplication rate for DNA-seq and RNA-seq experiments.
    Results: In this paper, we present a computational method to estimate the average PCR duplication rate of high-throughput sequence datasets that accounts for natural read duplicates by leveraging heterozygous variants in an individual genome. Analysis of simulated data and exome sequence data from the 1000 Genomes project demonstrated that our method can accurately estimate the PCR duplication rate on paired-end as well as single-end read datasets which contain a high proportion of natural read duplicates. Further, analysis of exome datasets prepared using the Nextera library preparation method indicated that 45-50% of read duplicates correspond to natural read duplicates likely due to fragmentation bias. Finally, analysis of RNA-seq datasets from individuals in the 1000 Genomes project demonstrated that 70-95% of read duplicates observed in such datasets correspond to natural duplicates sampled from genes with high expression and identified outlier samples with a 2-fold greater PCR duplication rate than other samples.
    Conclusions: The method described here is a useful tool for estimating the PCR duplication rate of high-throughput sequence datasets and for assessing the fraction of read duplicates that correspond to natural read duplicates. An implementation of the method is available at https://github.com/vibansal/PCRduplicates .
    MeSH term(s) Alleles ; Cell Line ; Cluster Analysis ; Computational Biology/methods ; Computer Simulation ; DNA Fragmentation ; Databases, Genetic ; Exome ; Genome, Human ; Genotyping Techniques ; Heterozygote ; High-Throughput Nucleotide Sequencing ; Humans ; Models, Theoretical ; Polymerase Chain Reaction ; Sequence Analysis, DNA ; Sequence Analysis, RNA
    Language English
    Publishing date 2017-03-14
    Publishing country England
    Document type Journal Article
    ZDB-ID 2041484-5
    ISSN (online) 1471-2105
    ISSN 1471-2105
    DOI 10.1186/s12859-017-1471-9
    Database MEDical Literature Analysis and Retrieval System OnLINE

  7. Article ; Online: An accurate algorithm for the detection of DNA fragments from dilution pool sequencing experiments.

    Bansal, Vikas

    Bioinformatics (Oxford, England)

    2017  Volume 34, Issue 1, Page(s) 155–162

    Abstract Motivation: The short read lengths of current high-throughput sequencing technologies limit the ability to recover long-range haplotype information. Dilution pool methods for preparing DNA sequencing libraries from high molecular weight DNA fragments enable the recovery of long DNA fragments from short sequence reads. These approaches require computational methods for identifying the DNA fragments using aligned sequence reads and assembling the fragments into long haplotypes. Although a number of computational methods have been developed for haplotype assembly, the problem of identifying DNA fragments from dilution pool sequence data has not received much attention.
    Results: We formulate the problem of detecting DNA fragments from dilution pool sequencing experiments as a genome segmentation problem and develop an algorithm that uses dynamic programming to optimize a likelihood function derived from a generative model for the sequence reads. This algorithm uses an iterative approach to automatically infer the mean background read depth and the number of fragments in each pool. Using simulated data, we demonstrate that our method, FragmentCut, has 25-30% greater sensitivity compared with an HMM based method for fragment detection and can also detect overlapping fragments. On a whole-genome human fosmid pool dataset, the haplotypes assembled using the fragments identified by FragmentCut had greater N50 length, 16.2% lower switch error rate and 35.8% lower mismatch error rate compared with two existing methods. We further demonstrate the greater accuracy of our method using two additional dilution pool datasets.
    Availability and implementation: FragmentCut is available from https://bansal-lab.github.io/software/FragmentCut.
    Contact: vibansal@ucsd.edu.
    Supplementary information: Supplementary data are available at Bioinformatics online.
    MeSH term(s) Algorithms ; Data Accuracy ; Genome, Human ; Haplotypes ; High-Throughput Nucleotide Sequencing/methods ; Humans ; Sequence Alignment/methods ; Sequence Analysis, DNA/methods ; Software
    Language English
    Publishing date 2017-09-28
    Publishing country England
    Document type Journal Article ; Research Support, N.I.H., Extramural
    ZDB-ID 1422668-6
    ISSN (online) 1367-4811
    ISSN 1367-4803
    DOI 10.1093/bioinformatics/btx436
    Database MEDical Literature Analysis and Retrieval System OnLINE

  8. Article ; Online: Temperament and character in an Australian sample: examining cross-sectional associations of personality with age, sex, and satisfaction with life.

    Eley, Diann S / Bansal, Vikas / Cloninger, C Robert / Leung, Janni

    PeerJ

    2023  Volume 11, Page(s) e15342

    Abstract Objective: Personality can influence how we interpret and react to our day-to-day life circumstances. Temperament and character are the primary dimensions of personality, and both are influenced genetically. Temperament represents our emotional core, while character reflects our goals and values as we develop through life. Research shows that where people live, their social, economic, and physical environment can influence attitudes and behaviors, and these have links to variations in personality traits. There are few studies that focus on Australian personality as temperament and character. Using an Australian general population sample, we examined the psychometric properties of the Temperament and Character Inventory (TCIR140) and investigated the associations between TCIR140 traits with both sociodemographic variables and measures of well-being. In addition, we investigated differences in temperament and character between our Australian general population sample and published results of similar studies from other countries.
    Methods: Australians (
    Results: Cronbach's alphas were high, ranging from
    Conclusions: Temperament and character are related to indicators of wellbeing and differ by age and sex. This Australian sample demonstrates a temperament that is high in Persistence and a character high in Self-Directedness and Cooperativeness, with an overall positive affect and a general satisfaction with life. In comparison to other countries, Australians in this sample differ in levels of several traits, demonstrating a cautious and independent temperament with a character that is cooperative, industrious, and self-reliant. Young adults, in comparison to older groups, have a temperament and character profile that is prone to negative emotions and a lower satisfaction with life.
    MeSH term(s) Female ; Humans ; Male ; Young Adult ; Australia ; Cross-Sectional Studies ; Personal Satisfaction ; Personality ; Temperament ; Psychometrics ; Resilience, Psychological
    Language English
    Publishing date 2023-05-11
    Publishing country United States
    Document type Journal Article
    ZDB-ID 2703241-3
    ISSN (online) 2167-8359
    ISSN 2167-8359
    DOI 10.7717/peerj.15342
    Database MEDical Literature Analysis and Retrieval System OnLINE

  9. Article: Transcription start site signal profiling improves transposable element RNA expression analysis at locus-level.

    Savytska, Natalia / Heutink, Peter / Bansal, Vikas

    Frontiers in genetics

    2022  Volume 13, Page(s) 1026847

    Abstract The transcriptional activity of Transposable Elements (TEs) has been implicated in numerous pathological processes, including neurodegenerative diseases such as amyotrophic lateral sclerosis and frontotemporal lobar degeneration. The TE expression analysis from short-read sequencing technologies is, however, challenging due to the multitude of similar sequences derived from single TE subfamilies and the exaptation of TEs within longer coding or non-coding RNAs. Specialised tools have been developed to quantify the expression of TEs that either rely on probabilistic re-distribution of multimapper count fractions or allow for discarding multimappers altogether. Until now, the benchmarking across those tools was largely limited to aggregated expression estimates over whole TE subfamilies. Here, we compared the performance of recently published tools (SQuIRE, TElocal, SalmonTE) with simplistic quantification strategies (featureCounts in unique, fraction and random modes) at the individual loci level. Using simulated datasets, we examined the false discovery rate and the primary driver of those false positive hits in the optimal quantification strategy. Our findings suggest a high false discovery number that exceeds the total number of correctly recovered active loci for all the quantification strategies, including the best performing tool
    Language English
    Publishing date 2022-10-21
    Publishing country Switzerland
    Document type Journal Article
    ZDB-ID 2606823-0
    ISSN 1664-8021
    DOI 10.3389/fgene.2022.1026847
    Database MEDical Literature Analysis and Retrieval System OnLINE

  10. Article ; Online: Whole-genome sequencing of multiple related individuals with type 2 diabetes reveals an atypical likely pathogenic mutation in the PAX6 gene.

    Boehm, Bernhard O / Kratzer, Wolfgang / Bansal, Vikas

    European journal of human genetics : EJHG

    2022  Volume 31, Issue 1, Page(s) 89–96

    Abstract Pathogenic variants in more than 14 genes have been implicated in monogenic diabetes; however, a significant fraction of individuals with young-onset diabetes and a strong family history of diabetes have unknown genetic etiology. To identify novel pathogenic alleles for monogenic diabetes, we performed whole-genome sequencing (WGS) on four related individuals with type 2 diabetes - including one individual diagnosed at the age of 31 years - that were negative for mutations in known monogenic diabetes genes. The individuals were ascertained from a large case-control study and had a multi-generation family history of diabetes. Identity-by-descent (IBD) analysis revealed that the four individuals represent two sib-pairs that are third-degree relatives. A novel missense mutation (p.P81S) in the PAX6 gene was one of eight rare coding variants across the genome shared IBD by all individuals and was inherited from affected mothers in both sib-pairs. The mutation affects a highly conserved amino acid located in the paired-domain of PAX6 - a hotspot for missense mutations that cause aniridia and other eye abnormalities. However, no eye-related phenotype was observed in any individual. The well-established functional role of PAX6 in glucose-induced insulin secretion and the co-segregation of diabetes in families with aniridia provide compelling support for the pathogenicity of this mutation for diabetes. The mutation could be classified as "likely pathogenic" with a posterior probability of 0.975 according to the ACMG/AMP guidelines. This is the first PAX6 missense mutation that is likely pathogenic for autosomal-dominant adult-onset diabetes without eye abnormalities.
    MeSH term(s) Humans ; Diabetes Mellitus, Type 2/genetics ; PAX6 Transcription Factor/genetics ; Case-Control Studies ; Mutation ; Eye Abnormalities/genetics ; Aniridia/genetics ; Homeodomain Proteins/genetics ; Eye Proteins/genetics ; Pedigree
    Chemical Substances PAX6 Transcription Factor ; Homeodomain Proteins ; Eye Proteins ; PAX6 protein, human
    Language English
    Publishing date 2022-10-07
    Publishing country England
    Document type Journal Article ; Research Support, Non-U.S. Gov't
    ZDB-ID 1141470-4
    ISSN (online) 1476-5438
    ISSN 1018-4813
    DOI 10.1038/s41431-022-01182-y
    Database MEDical Literature Analysis and Retrieval System OnLINE
