LIVIVO - The Search Portal for Life Sciences

Search results

Results 1 - 10 of 59 total

  1. Article ; Online: Cooperation of Spaln and Prrn5 for Construction of Gene-Structure-Aware Multiple Sequence Alignment.

    Gotoh, Osamu

    Methods in molecular biology (Clifton, N.J.)

    2021  Volume 2231, Page(s) 71–88

    Abstract Gene-structure-aware multiple sequence alignment (GSA-MSA) is conventionally used as a tool for analyzing evolutionary changes in gene structure, i.e., gain and loss of introns during the course of evolution of homologous eukaryotic genes. Recently, however, it has become apparent that GSA-MSA is a powerful tool for detecting and remedying gene-prediction errors prevalent in genome annotations produced by various genome projects. Unfortunately, the construction of GSA-MSAs has so far required tedious procedures, thereby preventing researchers from enjoying the potential benefits of GSA-MSAs. In this chapter, we introduce a straightforward way for constructing GSA-MSAs when one or more genomic sequences and a set of transcript sequences (protein or full-length cDNAs/CDSs) are given. Our method requires no external tool or extra data, such as annotation files, although a supplementary script can generate a gene-structure-informed (GSI) transcript sequence file from annotation files.
    MeSH term(s) Algorithms ; Amino Acid Sequence ; Chromosome Mapping/methods ; DNA, Complementary/genetics ; Databases, Genetic ; Genomics/methods ; Introns ; Phylogeny ; RNA Splicing/genetics ; Sequence Alignment/methods ; Software
    Chemical Substances DNA, Complementary
    Language English
    Publishing date 2021-02-11
    Publishing country United States
    Document type Journal Article ; Research Support, Non-U.S. Gov't
    ISSN 1940-6029
    ISSN (online) 1940-6029
    DOI 10.1007/978-1-0716-1036-7_5
    Database MEDical Literature Analysis and Retrieval System OnLINE

  2. Article ; Online: Modeling one thousand intron length distributions with fitild.

    Gotoh, Osamu

    Bioinformatics (Oxford, England)

    2018  Volume 34, Issue 19, Page(s) 3258–3264

    Abstract Motivation: Intron length distribution (ILD) is a specific feature of a genome that exhibits extensive species-specific variation. Whereas ILD contributes to up to 30% of the total information content for intron recognition in some species, rendering it an important component of computational gene prediction, very few studies have been conducted to quantitatively characterize ILDs of various species.
    Results: We developed a set of computer programs (fitild, compild, etc.) to build statistical models of ILDs and compare them with one another. Each ILD of more than 1000 genomes was fitted with fitild to a statistical model consisting of one, two, or three components of Frechet distributions. Several measures of distances between ILDs were calculated by compild. A theoretical model was presented to better understand the origin of the observed shape of an ILD.
    Availability and implementation: The C++ source codes are available at https://github.com/ogotoh/fitild.git/.
    Supplementary information: Supplementary data are available at Bioinformatics online.
    MeSH term(s) Computational Biology ; Genome ; Introns ; Models, Statistical ; Software
    Language English
    Publishing date 2018-05-03
    Publishing country England
    Document type Journal Article ; Research Support, Non-U.S. Gov't
    ZDB-ID 1422668-6
    ISSN 1367-4811 ; 1367-4803
    ISSN (online) 1367-4811
    ISSN 1367-4803
    DOI 10.1093/bioinformatics/bty353
    Database MEDical Literature Analysis and Retrieval System OnLINE

  3. Article ; Online: Modeling one thousand intron length distributions with fitild

    Gotoh, Osamu

    Bioinformatics. 2018 Oct. 01, v. 34, no. 19, p. 3258-3264

    2018  Volume 34, Issue 19, Page(s) 3258–3264

    Abstract Intron length distribution (ILD) is a specific feature of a genome that exhibits extensive species-specific variation. Whereas ILD contributes to up to 30% of the total information content for intron recognition in some species, rendering it an important component of computational gene prediction, very few studies have been conducted to quantitatively characterize ILDs of various species. We developed a set of computer programs (fitild, compild, etc.) to build statistical models of ILDs and compare them with one another. Each ILD of more than 1000 genomes was fitted with fitild to a statistical model consisting of one, two, or three components of Frechet distributions. Several measures of distances between ILDs were calculated by compild. A theoretical model was presented to better understand the origin of the observed shape of an ILD. The C++ source codes are available at https://github.com/ogotoh/fitild.git/. Supplementary data are available at Bioinformatics online.
    Keywords bioinformatics ; computers ; introns ; prediction ; statistical models ; theoretical models
    Language English
    Dates of publication 2018-10-01
    Size p. 3258-3264
    Publishing place Oxford University Press
    Document type Article ; Online
    Note Use and reproduction
    ZDB-ID 1468345-3
    ISSN 1367-4811 ; 1460-2059
    DOI 10.1093/bioinformatics/bty353
    Database NAL-Catalogue (AGRICOLA)

  4. Article ; Online: Heuristic alignment methods.

    Gotoh, Osamu

    Methods in molecular biology (Clifton, N.J.)

    2014  Volume 1079, Page(s) 29–43

    Abstract Computation of multiple sequence alignment (MSA) is usually formulated as a combinatory optimization problem of an objective function. Solving the problem for virtually all sensible objective functions is known to be NP-complete implying that some heuristics must be adopted. Several general strategies have been proven effective to obtain accurate MSAs in reasonable computational costs. This chapter is devoted to a brief summary of most successful heuristic approaches.
    MeSH term(s) Computational Biology ; Sequence Alignment/methods
    Language English
    Publishing date 2014
    Publishing country United States
    Document type Journal Article ; Research Support, Non-U.S. Gov't ; Review
    ISSN 1940-6029
    ISSN (online) 1940-6029
    DOI 10.1007/978-1-62703-646-7_2
    Database MEDical Literature Analysis and Retrieval System OnLINE

  5. Article ; Online: Evolution of cytochrome p450 genes from the viewpoint of genome informatics.

    Gotoh, Osamu

    Biological & pharmaceutical bulletin

    2012  Volume 35, Issue 6, Page(s) 812–817

    Abstract Cytochrome P450 (CYP) constitutes a large gene superfamily descended from a single common ancestor. CYP genes are widely distributed in all domains of life from bacteria, archaea, and viruses to higher plants and animals. Because of their monophyletic nature, all CYP genes may be hierarchically classified at several distinct levels based on similarity of the protein amino acid sequences. A five-level classification (class, group, clan, family, and subfamily) is reasonably stable and useful for conceptual categorization of CYP genes. With a few exceptions, genes in a clan are specific to a kingdom or phylum, whereas cross-kingdom genes may belong to the same group, indicating an ancient origin of CYP diversification. CYP proteins are often functionally categorized into catalysts of "endogenous," "secondary," and "xenobiotic" compounds according to their substrate specificities. It was once postulated that xenobiotic-metabolizing enzymes were derived from an endogenous substrate-catalyzing enzyme. Although functional flow from endogenous to xenobiotic substrates occurred, recent evidence from a wide range of genomic analyses has indicated that the opposite is the more dominant stream. Expression of most vertebrate CYP genes is regulated by internal and external stimuli through transcription factors in the nuclear receptor family and bHLH-PAS family. Some aspects of cooperative evolution between transcriptional regulators and their target genes are briefly reviewed.
    MeSH term(s) Animals ; Cytochrome P-450 Enzyme System/genetics ; Evolution, Molecular ; Genome ; Humans
    Chemical Substances Cytochrome P-450 Enzyme System (9035-51-2)
    Language English
    Publishing date 2012-05-24
    Publishing country Japan
    Document type Journal Article ; Review
    ZDB-ID 1150271-x
    ISSN 1347-5215 ; 0918-6158
    ISSN (online) 1347-5215
    ISSN 0918-6158
    DOI 10.1248/bpb.35.812
    Database MEDical Literature Analysis and Retrieval System OnLINE

  6. Article ; Online: Genomic alterations in gynecological malignancies: histotype-associated driver mutations, molecular subtyping schemes, and tumorigenic mechanisms.

    Mori, Seiichi / Gotoh, Osamu / Kiyotani, Kazuma / Low, Siew Kee

    Journal of human genetics

    2021  Volume 66, Issue 9, Page(s) 853–868

    Abstract There are numerous histological subtypes (histotypes) of gynecological malignancies, with each histotype considered to largely reflect a feature of the "cell of origin," and to be tightly linked with the clinical behavior and biological phenotype of the tumor. The recent advances in massive parallel sequencing technologies have provided a more complete picture of the range of the genomic alterations that can persist within individual tumors, and have highlighted the types and frequencies of driver-gene mutations and molecular subtypes often associated with these histotypes. Several large-scale genomic cohorts, including the Cancer Genome Atlas (TCGA), have been used to characterize the genomic features of a range of gynecological malignancies, including high-grade serous ovarian carcinoma, uterine corpus endometrial carcinoma, uterine cervical carcinoma, and uterine carcinosarcoma. These datasets have also been pivotal in identifying clinically relevant molecular targets and biomarkers, and in the construction of molecular subtyping schemes. In addition, the recent widespread use of clinical sequencing for the more ubiquitous types of gynecological cancer has manifested in a series of large genomic datasets that have allowed the characterization of the genomes, driver mutations, and histotypes of even rare cancer types, with sufficient statistical power. Here, we review the field of gynecological cancer, and seek to describe the genomic features by histotype. We also will demonstrate how these are linked with clinicopathological attributes and highlight the potential tumorigenic mechanisms.
    MeSH term(s) Carcinogenesis/genetics ; Female ; Genital Neoplasms, Female/genetics ; Genomics ; Humans ; Mutation
    Language English
    Publishing date 2021-06-07
    Publishing country England
    Document type Journal Article ; Review
    ZDB-ID 1425192-9
    ISSN 1435-232X ; 1434-5161
    ISSN (online) 1435-232X
    ISSN 1434-5161
    DOI 10.1038/s10038-021-00940-y
    Database MEDical Literature Analysis and Retrieval System OnLINE

  7. Article ; Online: Evolutionary origin of mitochondrial cytochrome P450.

    Omura, Tsuneo / Gotoh, Osamu

    Journal of biochemistry

    2017  Volume 161, Issue 5, Page(s) 399–407

    Abstract Different molecular species of cytochrome P450 (P450) are distributed between endoplasmic reticulum (microsomes) and mitochondria in animal cells. Plants and fungi have many microsomal P450s, but no mitochondrial P450 has so far been reported. To elucidate the evolutionary origin of mitochondrial P450s in animal cells, available evidence is examined, and the virtual absence of mitochondrial P450 in plants and fungi is confirmed. It is also suggested that a microsomal P450 is the ancestor of animal mitochondrial P450s. It is likely that the endoplasmic reticulum-targeting sequence at the amino-terminus of a microsomal P450 was converted to a mitochondria-targeting sequence possibly by point mutations of a few amino acid residues or by an exon-shuffling/moving event shortly after animal lineage diverged from plants and fungi in the course of evolution of eukaryotes. It is suggested that the microsome-type P450 first imported into mitochondria utilized the existing ferredoxin in the matrix to receive electrons from NADPH, retained its oxygenase activity in the mitochondria, and gradually diversified to several P450s with different substrate specificities in the course of the evolution of animals.
    MeSH term(s) Animals ; Cytochrome P-450 Enzyme System/metabolism ; Fungi/metabolism ; Mitochondria/enzymology ; Mitochondria/metabolism ; Plants/metabolism
    Chemical Substances Cytochrome P-450 Enzyme System (9035-51-2)
    Language English
    Publishing date 2017-05-01
    Publishing country England
    Document type Journal Article
    ZDB-ID 218073-x
    ISSN 1756-2651 ; 0021-924X
    ISSN (online) 1756-2651
    ISSN 0021-924X
    DOI 10.1093/jb/mvx011
    Database MEDical Literature Analysis and Retrieval System OnLINE

  8. Article ; Online: A space-efficient and accurate method for mapping and aligning cDNA sequences onto genomic sequence.

    Gotoh, Osamu

    Nucleic acids research

    2008  Volume 36, Issue 8, Page(s) 2630–2638

    Abstract The mapping and alignment of transcripts (cDNA, expressed sequence tag or amino acid sequences) onto a genomic sequence is a fundamental step for genome annotation, including gene finding and analyses of transcriptional activity, alternative splicing and nucleotide polymorphisms. As DNA sequence data of genomes and transcripts are accumulating at an unprecedented rate, steady improvement in accuracy, speed and space requirement in the computational tools for mapping/alignment is desired. We devised a multi-phase heuristic algorithm and implemented it in the development of the stand-alone computer program Spaln (space-efficient spliced alignment). Spaln is reasonably fast and space efficient; it requires <1 Gb of memory to map and align >120 000 Unigene sequences onto the unmasked whole human genome with a conventional computer, finishing the job in <6 h. With artificially introduced noise of various levels, Spaln significantly outperforms other leading alignment programs currently available with respect to the accuracy of mapped exon-intron structures. This performance is achieved without extensive learning procedures to adjust parameter values to a particular organism. According to the handiness and accuracy, Spaln may be used for studies on a wide area of genome analyses.
    MeSH term(s) Algorithms ; Animals ; Chromosome Mapping/methods ; DNA, Complementary/chemistry ; Genomics/methods ; Humans ; Mice ; Sequence Alignment/methods ; Software
    Chemical Substances DNA, Complementary
    Language English
    Publishing date 2008-03-15
    Publishing country England
    Document type Evaluation Study ; Journal Article ; Research Support, Non-U.S. Gov't
    ZDB-ID 186809-3
    ISSN 1362-4962 ; 1362-4954 ; 0301-5610 ; 0305-1048
    ISSN (online) 1362-4962 ; 1362-4954
    ISSN 0301-5610 ; 0305-1048
    DOI 10.1093/nar/gkn105
    Database MEDical Literature Analysis and Retrieval System OnLINE

  9. Article ; Online: Direct mapping and alignment of protein sequences onto genomic sequence.

    Gotoh, Osamu

    Bioinformatics (Oxford, England)

    2008  Volume 24, Issue 21, Page(s) 2438–2444

    Abstract Motivation: Finding protein-coding genes in a newly determined genomic sequence is the first step toward understanding the content written in the genome. Sequences of transcripts of homologous genes, if available, can considerably improve accuracy of prediction of genes and their structures, compared with that without such knowledge. As protein sequences are generally better conserved than nucleotide sequences, remote homologs can be used as templates, extending the applicability of evidence-based gene recognition methods. However, no tool seems to have been developed so far to simultaneously map and align a number of protein sequences on mammalian-sized genomic sequence.
    Results: We have extended our computer program Spaln to accept protein sequences, as well as cDNA sequences, as queries. When the query and the target sequences are reasonably similar, e.g. between mammalian orthologs, Spaln runs one to two orders of magnitude faster than conventional approaches that rely on Blast search followed by dynamic-programming-based spliced alignment. Exon-level and gene-level accuracies of Spaln are significantly higher than those obtained by the best available methods of the same type, particularly when the query and the target are distantly related.
    Availability: Spaln is accessible online for a few species at http://www.genome.ist.i.kyoto-u.ac.jp/~aln_user. The source code is available for free for academic users from the same site.
    MeSH term(s) Algorithms ; Amino Acid Sequence ; DNA, Complementary/genetics ; Exons ; Genome ; Genomics ; Molecular Sequence Data ; Proteins/chemistry ; Proteins/genetics ; Sequence Alignment ; Sequence Analysis, Protein/methods
    Chemical Substances DNA, Complementary ; Proteins
    Language English
    Publishing date 2008-11-01
    Publishing country England
    Document type Journal Article ; Research Support, Non-U.S. Gov't
    ZDB-ID 1422668-6
    ISSN 1367-4811 ; 1367-4803
    ISSN (online) 1367-4811
    ISSN 1367-4803
    DOI 10.1093/bioinformatics/btn460
    Database MEDical Literature Analysis and Retrieval System OnLINE

  10. Article ; Online: A method of sample-wise region-set enrichment analysis for DNA methylomics.

    Minegishi, Ryu / Gotoh, Osamu / Tanaka, Norio / Maruyama, Reo / Chang, Jeffrey T / Mori, Seiichi

    Epigenomics

    2021  Volume 13, Issue 14, Page(s) 1081–1093

    Abstract Aim:
    MeSH term(s) Algorithms ; Biomarkers ; Blood Cells/metabolism ; Computational Biology/methods ; CpG Islands ; DNA Methylation ; Epigenesis, Genetic ; Epigenomics/methods ; Gene Expression Profiling/methods ; Humans ; Molecular Sequence Annotation ; Transcriptome
    Chemical Substances Biomarkers
    Language English
    Publishing date 2021-07-09
    Publishing country England
    Document type Journal Article ; Research Support, Non-U.S. Gov't ; Research Support, U.S. Gov't, Non-P.H.S.
    ZDB-ID 2537199-X
    ISSN 1750-192X ; 1750-1911
    ISSN (online) 1750-192X
    ISSN 1750-1911
    DOI 10.2217/epi-2021-0065
    Database MEDical Literature Analysis and Retrieval System OnLINE
