LIVIVO - The Search Portal for Life Sciences


Search results

Results 1-10 of 102

  1. Article ; Online: Structured sparsity regularization for analyzing high-dimensional omics data.

    Vinga, Susana

    Briefings in bioinformatics

    2020  Volume 22, Issue 1, Page(s) 77–87

    Abstract The development of new molecular and cell technologies is having a significant impact on the quantity of data generated nowadays. The growth of omics databases is creating a considerable potential for knowledge discovery and, concomitantly, is bringing new challenges to statistical learning and computational biology for health applications. Indeed, the high dimensionality of these data may hamper the use of traditional regression methods and parameter estimation algorithms due to the intrinsic non-identifiability of the inherent optimization problem. Regularized optimization has been rising as a promising and useful strategy to solve these ill-posed problems by imposing additional constraints in the solution parameter space. In particular, the field of statistical learning with sparsity has been significantly contributing to building accurate models that also bring interpretability to biological observations and phenomena. Beyond the now-classic elastic net, one of the best-known methods that combine lasso with ridge penalizations, we briefly overview recent literature on structured regularizers and penalty functions that have been applied in biomedical data to build parsimonious models in a variety of underlying contexts, from survival to generalized linear models. These methods include functions of $\ell_k$-norms and network-based penalties that take into account the inherent relationships between the features. The successful application to omics data illustrates the potential of sparse structured regularization for identifying disease's molecular signatures and for creating high-performance clinical decision support systems towards more personalized healthcare. Supplementary information: Supplementary data are available at Briefings in Bioinformatics online.
    MeSH term(s) Algorithms ; Computational Biology/methods ; Decision Support Systems, Clinical/standards ; Humans ; Precision Medicine/methods
    Language English
    Publishing date 2020-06-27
    Publishing country England
    Document type Journal Article ; Research Support, Non-U.S. Gov't ; Review
    ZDB-ID 2068142-2
    ISSN 1477-4054 ; 1467-5463
    ISSN (online) 1477-4054
    ISSN 1467-5463
    DOI 10.1093/bib/bbaa122
    Database MEDical Literature Analysis and Retrieval System OnLINE


  2. Article ; Online: Kidney Cancer Biomarker Selection Using Regularized Survival Models.

    Peixoto, Carolina / Martins, Marta / Costa, Luís / Vinga, Susana

    Cells

    2022  Volume 11, Issue 15

    Abstract Clear cell renal cell carcinoma (ccRCC) is the most common subtype of RCC showing a significant percentage of mortality. One of the priorities of kidney cancer research is to identify RCC-specific biomarkers for early detection and screening of the disease. With the development of high-throughput technology, it is now possible to measure the expression levels of thousands of genes in parallel and assess the molecular profile of individual tumors. Studying the relationship between gene expression and survival outcome has been widely used to find genes associated with cancer survival, providing new information for clinical decision-making. One of the challenges of using transcriptomics data is their high dimensionality which can lead to instability in the selection of gene signatures. Here we identify potential prognostic biomarkers correlated to the survival outcome of ccRCC patients using two network-based regularizers (EN and TCox) applied to Cox models. Some genes always selected by each method were found (
    MeSH term(s) Biomarkers, Tumor/genetics ; Biomarkers, Tumor/metabolism ; Carcinoma, Renal Cell/metabolism ; Humans ; Kidney/pathology ; Kidney Neoplasms/genetics ; Kidney Neoplasms/pathology ; Transcriptome/genetics
    Chemical Substances Biomarkers, Tumor
    Language English
    Publishing date 2022-07-27
    Publishing country Switzerland
    Document type Journal Article ; Research Support, Non-U.S. Gov't
    ZDB-ID 2661518-6
    ISSN 2073-4409
    ISSN (online) 2073-4409
    DOI 10.3390/cells11152311
    Database MEDical Literature Analysis and Retrieval System OnLINE


  3. Article ; Online: Tracking intratumoral heterogeneity in glioblastoma via regularized classification of single-cell RNA-Seq data.

    Lopes, Marta B / Vinga, Susana

    BMC bioinformatics

    2020  Volume 21, Issue 1, Page(s) 59

    Abstract Background: Understanding cellular and molecular heterogeneity in glioblastoma (GBM), the most common and aggressive primary brain malignancy, is a crucial step towards the development of effective therapies. Besides the inter-patient variability, the presence of multiple cell populations within tumors calls for the need to develop modeling strategies able to extract the molecular signatures driving tumor evolution and treatment failure. With the advances in single-cell RNA Sequencing (scRNA-Seq), tumors can now be dissected at the cell level, unveiling information from their life history to their clinical implications.
    Results: We propose a classification setting based on GBM scRNA-Seq data, through sparse logistic regression, where different cell populations (neoplastic and normal cells) are taken as classes. The goal is to identify gene features discriminating between the classes, but also those shared by different neoplastic clones. The latter will be approached via the network-based twiner regularizer to identify gene signatures shared by neoplastic cells from the tumor core and infiltrating neoplastic cells originated from the tumor periphery, as putative disease biomarkers to target multiple neoplastic clones. Our analysis is supported by the literature through the identification of several known molecular players in GBM. Moreover, the relevance of the selected genes was confirmed by their significance in the survival outcomes in bulk GBM RNA-Seq data, as well as their association with several Gene Ontology (GO) biological process terms.
    Conclusions: We presented a methodology intended to identify genes discriminating between GBM clones, but also those playing a similar role in different GBM neoplastic clones (including migrating cells), therefore potential targets for therapy research. Our results contribute to a deeper understanding on the genetic features behind GBM, by disclosing novel therapeutic directions accounting for GBM heterogeneity.
    MeSH term(s) Brain Neoplasms/genetics ; Brain Neoplasms/metabolism ; Classification/methods ; Gene Ontology ; Glioblastoma/genetics ; Glioblastoma/metabolism ; Humans ; RNA-Seq ; Single-Cell Analysis
    Language English
    Publishing date 2020-02-18
    Publishing country England
    Document type Journal Article
    ZDB-ID 2041484-5
    ISSN 1471-2105
    ISSN (online) 1471-2105
    DOI 10.1186/s12859-020-3390-4
    Database MEDical Literature Analysis and Retrieval System OnLINE


  4. Article ; Online: Editorial: Alignment-free methods in computational biology.

    Vinga, Susana

    Briefings in bioinformatics

    2014  Volume 15, Issue 3, Page(s) 341–342

    MeSH term(s) Computational Biology/methods ; Computational Biology/trends ; Genomics ; High-Throughput Nucleotide Sequencing ; Information Theory ; Sequence Alignment ; Sequence Analysis
    Language English
    Publishing date 2014-05
    Publishing country England
    Document type Editorial ; Introductory Journal Article
    ZDB-ID 2068142-2
    ISSN 1477-4054 ; 1467-5463
    ISSN (online) 1477-4054
    ISSN 1467-5463
    DOI 10.1093/bib/bbu005
    Database MEDical Literature Analysis and Retrieval System OnLINE


  5. Article ; Online: ROSIE: RObust Sparse ensemble for outlIEr detection and gene selection in cancer omics data.

    Jensch, Antje / Lopes, Marta B / Vinga, Susana / Radde, Nicole

    Statistical methods in medical research

    2022  Volume 31, Issue 5, Page(s) 947–958

    Abstract The extraction of novel information from omics data is a challenging task, in particular, since the number of features (e.g. genes) often far exceeds the number of samples. In such a setting, conventional parameter estimation leads to ill-posed optimization problems, and regularization may be required. In addition, outliers can largely impact classification accuracy. Here we introduce ROSIE, an ensemble classification approach, which combines three sparse and robust classification methods for outlier detection and feature selection and further performs a bootstrap-based validity check. Outliers of ROSIE are determined by the rank product test using outlier rankings of all three methods, and important features are selected as features commonly selected by all methods. We apply ROSIE to RNA-Seq data from The Cancer Genome Atlas (TCGA) to classify observations into Triple-Negative Breast Cancer (TNBC) and non-TNBC tissue samples. The pre-processed dataset consists of
    MeSH term(s) Humans ; Triple Negative Breast Neoplasms/genetics
    Language English
    Publishing date 2022-01-24
    Publishing country England
    Document type Journal Article ; Research Support, Non-U.S. Gov't
    ZDB-ID 1136948-6
    ISSN 1477-0334 ; 0962-2802
    ISSN (online) 1477-0334
    ISSN 0962-2802
    DOI 10.1177/09622802211072456
    Database MEDical Literature Analysis and Retrieval System OnLINE


  6. Article ; Online: Using Markov chains and temporal alignment to identify clinical patterns in Dementia.

    Costa, Luísa Marote / Colaço, João / Carvalho, Alexandra M / Vinga, Susana / Teixeira, Andreia Sofia

    Journal of biomedical informatics

    2023  Volume 140, Page(s) 104328

    Abstract In the healthcare sector, resorting to big data and advanced analytics is a great advantage when dealing with complex groups of patients in terms of comorbidities, representing a significant step towards personalized targeting. In this work, we focus on understanding key features and clinical pathways of patients with multimorbidity suffering from Dementia. This disease can result from many heterogeneous factors, potentially becoming more prevalent as the population ages. We present a set of methods that allow us to identify medical appointment patterns within a cohort of 1924 patients followed from January 2007 to August 2021 in Hospital da Luz (Lisbon), and to stratify patients into subgroups that exhibit similar patterns of interaction. With Markov Chains, we are able to identify the most prevailing medical appointments attended by Dementia patients, as well as recurring transitions between these. To perform patient stratification, we applied AliClu, a temporal sequence alignment algorithm for clustering longitudinal clinical data, which allowed us to successfully identify patient subgroups with similar medical appointment activity. A feature analysis per cluster obtained allows the identification of distinct patterns and characteristics. This pipeline provides a tool to identify prevailing clinical pathways of medical appointments within the dataset, as well as the most common transitions between medical specialities within Dementia patients. This methodology, alongside demographic and clinical data, has the potential to provide early signalling of the most likely clinical pathways and serve as a support tool for health providers in deciding the best course of treatment, considering a patient as a whole.
    MeSH term(s) Humans ; Markov Chains ; Comorbidity ; Multimorbidity ; Algorithms ; Dementia/diagnosis
    Language English
    Publishing date 2023-03-14
    Publishing country United States
    Document type Journal Article ; Research Support, Non-U.S. Gov't
    ZDB-ID 2057141-0
    ISSN 1532-0480 ; 1532-0464
    ISSN (online) 1532-0480
    ISSN 1532-0464
    DOI 10.1016/j.jbi.2023.104328
    Database MEDical Literature Analysis and Retrieval System OnLINE


  7. Article ; Online: Kidney Cancer Biomarker Selection Using Regularized Survival Models

    Carolina Peixoto / Marta Martins / Luís Costa / Susana Vinga

    Cells, Vol 11, Iss 2311, p

    2022  Volume 2311

    Abstract Clear cell renal cell carcinoma (ccRCC) is the most common subtype of RCC showing a significant percentage of mortality. One of the priorities of kidney cancer research is to identify RCC-specific biomarkers for early detection and screening of the disease. With the development of high-throughput technology, it is now possible to measure the expression levels of thousands of genes in parallel and assess the molecular profile of individual tumors. Studying the relationship between gene expression and survival outcome has been widely used to find genes associated with cancer survival, providing new information for clinical decision-making. One of the challenges of using transcriptomics data is their high dimensionality which can lead to instability in the selection of gene signatures. Here we identify potential prognostic biomarkers correlated to the survival outcome of ccRCC patients using two network-based regularizers (EN and TCox) applied to Cox models. Some genes always selected by each method were found (COPS7B, DONSON, GTF2E2, HAUS8, PRH2, and ZNF18) with known roles in cancer formation and progression. Afterward, different lists of genes ranked based on distinct metrics (logFC of DEGs or $\beta$ coefficients of regression) were analyzed using GSEA to try to find over- or under-represented mechanisms and pathways. Some ontologies were found in common between the gene sets tested, such as nuclear division, microtubule and tubulin binding, and plasma membrane and chromosome regions. Additionally, genes that were more involved in these ontologies and genes selected by the regularizers were used to create a new gene set where we applied the Cox regression model. With this smaller gene set, we were able to significantly split patients into high/low risk groups showing the importance of studying these genes as potential prognostic factors to help clinicians better identify and ...
    Keywords kidney cancer ; regularization ; Cox regression ; biomarker selection ; gene ontology ; Biology (General) ; QH301-705.5
    Subject code 616
    Language English
    Publishing date 2022-07-01T00:00:00Z
    Publisher MDPI AG
    Document type Article ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)


  8. Article: Coupling sparse Cox models with clustering of longitudinal transcriptomics data for trauma prognosis.

    Constantino, Cláudia S / Carvalho, Alexandra M / Vinga, Susana

    BioData mining

    2021  Volume 14, Issue 1, Page(s) 25

    Abstract Background: Longitudinal gene expression analysis and survival modeling have been proved to add valuable biological and clinical knowledge. This study proposes a novel framework to discover gene signatures and patterns in a high-dimensional time series transcriptomics data and to assess their association with hospital length of stay.
    Methods: We investigated a longitudinal and high-dimensional gene expression dataset from 168 blunt-force trauma patients followed during the first 28 days after injury. To model the length of stay, an initial dimensionality reduction step was performed by applying Cox regression with elastic net regularization using gene expression data from the first hospitalization days. Also, a novel methodology to impute missing values to the genes selected previously was proposed. We then applied multivariate time series (MTS) clustering to analyse gene expression over time and to stratify patients with similar trajectories. The validation of the patients' partitions obtained by MTS clustering was performed using Kaplan-Meier curves and log-rank tests.
    Results: We were able to unravel 22 genes strongly associated with hospital's discharge. Their expression values in the first days after trauma showed to be good predictors of the length of stay. The proposed mixed imputation method allowed to achieve a complete dataset of short time series with a minimum loss of information for the 28 days of follow-up. MTS clustering enabled to group patients with similar genes trajectories and, notably, with similar discharge days from the hospital. Patients within each cluster have comparable genes' trajectories and may have an analogous response to injury.
    Conclusion: The proposed framework was able to tackle the joint analysis of time-to-event information with longitudinal multivariate high-dimensional data. The application to length of stay and transcriptomics data revealed a strong relationship between gene expression trajectory and patients' recovery, which may improve trauma patient's management by healthcare systems. The proposed methodology can be easily adapted to other medical data, towards more effective clinical decision support systems for health applications.
    Language English
    Publishing date 2021-04-14
    Publishing country England
    Document type Journal Article
    ZDB-ID 2438773-3
    ISSN 1756-0381
    DOI 10.1186/s13040-021-00257-8
    Database MEDical Literature Analysis and Retrieval System OnLINE


  9. Article ; Online: Tracking intratumoral heterogeneity in glioblastoma via regularized classification of single-cell RNA-Seq data

    Marta B. Lopes / Susana Vinga

    BMC Bioinformatics, Vol 21, Iss 1, Pp 1-

    2020  Volume 12

    Abstract Background: Understanding cellular and molecular heterogeneity in glioblastoma (GBM), the most common and aggressive primary brain malignancy, is a crucial step towards the development of effective therapies. Besides the inter-patient variability, the presence of multiple cell populations within tumors calls for the need to develop modeling strategies able to extract the molecular signatures driving tumor evolution and treatment failure. With the advances in single-cell RNA Sequencing (scRNA-Seq), tumors can now be dissected at the cell level, unveiling information from their life history to their clinical implications.
    Results: We propose a classification setting based on GBM scRNA-Seq data, through sparse logistic regression, where different cell populations (neoplastic and normal cells) are taken as classes. The goal is to identify gene features discriminating between the classes, but also those shared by different neoplastic clones. The latter will be approached via the network-based twiner regularizer to identify gene signatures shared by neoplastic cells from the tumor core and infiltrating neoplastic cells originated from the tumor periphery, as putative disease biomarkers to target multiple neoplastic clones. Our analysis is supported by the literature through the identification of several known molecular players in GBM. Moreover, the relevance of the selected genes was confirmed by their significance in the survival outcomes in bulk GBM RNA-Seq data, as well as their association with several Gene Ontology (GO) biological process terms.
    Conclusions: We presented a methodology intended to identify genes discriminating between GBM clones, but also those playing a similar role in different GBM neoplastic clones (including migrating cells), therefore potential targets for therapy research. Our results contribute to a deeper understanding on the genetic features behind GBM, by disclosing novel therapeutic directions accounting for GBM heterogeneity.
    Keywords Glioblastoma ; Sparse logistic regression ; Gene network ; Twiner ; Computer applications to medicine. Medical informatics ; R858-859.7 ; Biology (General) ; QH301-705.5
    Language English
    Publishing date 2020-02-01T00:00:00Z
    Publisher BMC
    Document type Article ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)


  10. Article ; Online: Information theory applications for biological sequence analysis.

    Vinga, Susana

    Briefings in bioinformatics

    2013  Volume 15, Issue 3, Page(s) 376–389

    Abstract Information theory (IT) addresses the analysis of communication systems and has been widely applied in molecular biology. In particular, alignment-free sequence analysis and comparison greatly benefited from concepts derived from IT, such as entropy and mutual information. This review covers several aspects of IT applications, ranging from genome global analysis and comparison, including block-entropy estimation and resolution-free metrics based on iterative maps, to local analysis, comprising the classification of motifs, prediction of transcription factor binding sites and sequence characterization based on linguistic complexity and entropic profiles. IT has also been applied to high-level correlations that combine DNA, RNA or protein features with sequence-independent properties, such as gene mapping and phenotype analysis, and has also provided models based on communication systems theory to describe information transmission channels at the cell level and also during evolutionary processes. While not exhaustive, this review attempts to categorize existing methods and to indicate their relation with broader transversal topics such as genomic signatures, data compression and complexity, time series analysis and phylogenetic classification, providing a resource for future developments in this promising area.
    MeSH term(s) Binding Sites/genetics ; Computational Biology/methods ; Genomics/methods ; Genomics/statistics & numerical data ; Humans ; Information Theory ; Models, Statistical ; Nonlinear Dynamics ; Phylogeny ; Saccharomyces cerevisiae/genetics ; Sequence Alignment ; Sequence Analysis/methods ; Sequence Analysis/statistics & numerical data ; Software ; Transcription Factors/metabolism
    Chemical Substances Transcription Factors
    Keywords covid19
    Language English
    Publishing date 2013-09-20
    Publishing country England
    Document type Journal Article ; Research Support, Non-U.S. Gov't ; Review
    ZDB-ID 2068142-2
    ISSN 1477-4054 ; 1467-5463
    ISSN (online) 1477-4054
    ISSN 1467-5463
    DOI 10.1093/bib/bbt068
    Database MEDical Literature Analysis and Retrieval System OnLINE
