LIVIVO - The Search Portal for Life Sciences

Search results for AU="Balasubramanian, Jeya Balaji"

Results 1–10 of 10

  1. Article: Tunable structure priors for Bayesian rule learning for knowledge integrated biomarker discovery.

    Balasubramanian, Jeya Balaji / Gopalakrishnan, Vanathi

    World journal of clinical oncology

    2018  Volume 9, Issue 5, Page(s) 98–109

    Abstract Aim: To develop a framework to incorporate background domain knowledge into classification rule learning for knowledge discovery in biomedicine.
    Methods: Bayesian rule learning (BRL) is a rule-based classifier that uses a greedy best-first search over a space of Bayesian belief-networks (BN) to find the optimal BN to explain the input dataset, and then infers classification rules from this BN. BRL uses a Bayesian score to evaluate the quality of BNs. In this paper, we extended the Bayesian score to include informative structure priors, which encode our prior domain knowledge about the dataset. We call this extension of BRL as BRL
    Results: We evaluated the degree of incorporation of prior knowledge into BRL
    Conclusion: BRL
    Language English
    Publishing date 2018-08-23
    Publishing country United States
    Document type Journal Article
    ZDB-ID 2587357-X
    ISSN 2218-4333
    DOI 10.5306/wjco.v9.i5.98
    Database MEDical Literature Analysis and Retrieval System OnLINE

  2. Article ; Online: A novel approach to modeling multifactorial diseases using Ensemble Bayesian Rule classifiers.

    Balasubramanian, Jeya Balaji / Boes, Rebecca D / Gopalakrishnan, Vanathi

    Journal of biomedical informatics

    2020  Volume 107, Page(s) 103455

    Abstract Modeling factors influencing disease phenotypes, from biomarker profiling study datasets, is a critical task in biomedicine. Such datasets are typically generated from high-throughput 'omic' technologies, which help examine disease mechanisms at an unprecedented resolution. These datasets are challenging because they are high-dimensional. The disease mechanisms they study are also complex because many diseases are multifactorial, resulting from the collective activity of several factors, each with a small effect. Bayesian rule learning (BRL) is a rule model inferred from learning Bayesian networks from data, and has been shown to be effective in modeling high-dimensional datasets. However, BRL is not efficient at modeling multifactorial diseases since it suffers from data fragmentation during learning. In this paper, we overcome this limitation by implementing and evaluating three types of ensemble model combination strategies with BRL: uniform combination (UC, the same as bagging), Bayesian model averaging (BMA), and Bayesian model combination (BMC), collectively called Ensemble Bayesian Rule Learning (EBRL). We also introduce a novel method to visualize EBRL models, called the Bayesian Rule Ensemble Visualizing tool (BREVity), which helps extract and interpret the most important rule patterns guiding the predictions made by the ensemble model. Our results, using twenty-five public, high-dimensional gene-expression datasets of multifactorial diseases, suggest that both EBRL models using UC and BMC achieve better predictive performance than BMA and other classic machine learning methods. Furthermore, BMC is found to be more reliable than UC when the ensemble includes sub-optimal models resulting from the stochasticity of the model search process.
    Together, EBRL and BREVity provide researchers with a promising and novel tool for modeling multifactorial diseases from high-dimensional datasets that leverages the strengths of ensemble methods for predictive performance, while also providing interpretable explanations for its predictions.
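    The combination strategies named above differ mainly in how member predictions are weighted. As a minimal sketch (toy class probabilities and a hypothetical `ensemble_predict` helper, not the authors' BRL implementation), UC averages members uniformly while BMA weights them by an approximate posterior:

```python
import numpy as np

def ensemble_predict(member_probs, log_marginal_likelihoods=None):
    """Combine per-model class probabilities (n_models x n_classes).

    With no weights: uniform combination (UC, i.e. bagging-style averaging).
    With log marginal likelihoods: Bayesian model averaging (BMA), where each
    model's weight is proportional to its (approximate) posterior.
    """
    member_probs = np.asarray(member_probs, dtype=float)
    if log_marginal_likelihoods is None:
        weights = np.full(len(member_probs), 1.0 / len(member_probs))
    else:
        log_w = np.asarray(log_marginal_likelihoods, dtype=float)
        log_w = log_w - log_w.max()        # subtract max for numerical stability
        weights = np.exp(log_w)
        weights /= weights.sum()
    return weights @ member_probs          # weighted average over models

# Three toy rule models scoring classes (case, control)
probs = [[0.9, 0.1], [0.6, 0.4], [0.2, 0.8]]
print(ensemble_predict(probs))                         # uniform weights
print(ensemble_predict(probs, [-10.0, -12.0, -30.0]))  # posterior-weighted
```

    BMC, as the abstract notes, instead averages over sampled combinations of ensemble members rather than individual models, but it still reduces to a weighted average of member predictions.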
    MeSH term(s) Bayes Theorem ; Machine Learning
    Language English
    Publishing date 2020-06-01
    Publishing country United States
    Document type Journal Article ; Research Support, N.I.H., Extramural ; Research Support, Non-U.S. Gov't
    ZDB-ID 2057141-0
    ISSN (online) 1532-0480
    ISSN 1532-0464
    DOI 10.1016/j.jbi.2020.103455
    Database MEDical Literature Analysis and Retrieval System OnLINE

  3. Article ; Online: Wasm-iCARE: a portable and privacy-preserving web module to build, validate, and apply absolute risk models.

    Balasubramanian, Jeya Balaji / Choudhury, Parichoy Pal / Mukhopadhyay, Srijon / Ahearn, Thomas / Chatterjee, Nilanjan / García-Closas, Montserrat / Almeida, Jonas S

    ArXiv

    2023  

    Abstract Objective: Absolute risk models estimate an individual's future disease risk over a specified time interval. Applications utilizing server-side risk tooling, such as the R-based iCARE (R-iCARE), to build, validate, and apply absolute risk models, face serious limitations in portability and privacy due to their need for circulating user data in remote servers for operation. Our objective was to overcome these limitations.
    Materials and methods: We refactored R-iCARE into a Python package (Py-iCARE) then compiled it to WebAssembly (Wasm-iCARE): a portable web module, which operates entirely within the privacy of the user's device.
    Results: We showcase the portability and privacy of Wasm-iCARE through two applications: for researchers to statistically validate risk models, and to deliver them to end-users. Both applications run entirely on the client side, requiring no downloads or installations, and keep user data on-device during risk calculation.
    Conclusions: Wasm-iCARE fosters accessible and privacy-preserving risk tools, accelerating their validation and delivery.
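    For intuition, the absolute risk such models estimate can be approximated as a discrete-time cumulative incidence in the presence of competing mortality. The sketch below is a generic illustration with invented, constant hazards; it is not the iCARE implementation:

```python
def absolute_risk(hazard, competing_hazard, years):
    """Discrete-time absolute risk of disease over `years`, i.e. the
    probability of developing disease before dying of other causes.

    `hazard` and `competing_hazard` are per-year event probabilities,
    assumed constant here for simplicity.
    """
    risk, event_free = 0.0, 1.0
    for _ in range(years):
        risk += event_free * hazard                          # disease this year
        event_free *= (1 - hazard) * (1 - competing_hazard)  # still event-free
    return risk

# e.g. 1% annual disease hazard, 0.5% competing mortality, 10-year horizon
print(f"{absolute_risk(0.01, 0.005, 10):.3f}")
```

    Because the event-free probability shrinks each year, the 10-year risk comes out slightly below the naive sum of ten annual 1% hazards.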
    Language English
    Publishing date 2023-10-13
    Publishing country United States
    Document type Preprint
    ISSN 2331-8422
    ISSN (online) 2331-8422
    Database MEDical Literature Analysis and Retrieval System OnLINE

  4. Book ; Online: Wasm-iCARE

    Balasubramanian, Jeya Balaji / Choudhury, Parichoy Pal / Mukhopadhyay, Srijon / Ahearn, Thomas / Chatterjee, Nilanjan / García-Closas, Montserrat / Almeida, Jonas S.

    a portable and privacy-preserving web module to build, validate, and apply absolute risk models

    2023  

    Abstract Objective: Absolute risk models estimate an individual's future disease risk over a specified time interval. Applications utilizing server-side risk tooling, such as the R-based iCARE (R-iCARE), to build, validate, and apply absolute risk models, face serious limitations in portability and privacy due to their need for circulating user data in remote servers for operation. Our objective was to overcome these limitations. Materials and Methods: We refactored R-iCARE into a Python package (Py-iCARE) then compiled it to WebAssembly (Wasm-iCARE): a portable web module, which operates entirely within the privacy of the user's device. Results: We showcase the portability and privacy of Wasm-iCARE through two applications: for researchers to statistically validate risk models, and to deliver them to end-users. Both applications run entirely on the client side, requiring no downloads or installations, and keep user data on-device during risk calculation. Conclusions: Wasm-iCARE fosters accessible and privacy-preserving risk tools, accelerating their validation and delivery.

    Comment: 10 pages, 2 figures
    Keywords Quantitative Biology - Quantitative Methods
    Subject code 005 ; 330
    Publishing date 2023-10-13
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

  5. Article ; Online: PRScalc, a privacy-preserving calculation of raw polygenic risk scores from direct-to-consumer genomics data.

    Sandoval, Lorena / Jafri, Saleet / Balasubramanian, Jeya Balaji / Bhawsar, Praphulla / Edelson, Jacob L / Martins, Yasmmin / Maass, Wolfgang / Chanock, Stephen J / Garcia-Closas, Montserrat / Almeida, Jonas S

    Bioinformatics advances

    2023  Volume 3, Issue 1, Page(s) vbad145

    Abstract Motivation: Currently, the Polygenic Score (PGS) Catalog curates over 400 publications on over 500 traits corresponding to over 3000 polygenic risk scores (PRSs). To assess the feasibility of privately calculating the underlying multivariate relative risk for individuals with consumer genomics data, we developed an in-browser PRS calculator for genomic data that does not circulate any data or engage in any computation outside of the user's personal device.
    Results: A prototype personal risk score calculator, created for research purposes, was developed to demonstrate how the PGS Catalog can be privately applied to readily available direct-to-consumer genetic testing services, such as 23andMe. No software download, installation, or configuration is needed. The PRS web calculator matches individual PGS Catalog entries with an individual's 23andMe genome data composed of 600k to 1.4M single-nucleotide polymorphisms (SNPs). Beta coefficients provide researchers with a convenient assessment of risk associated with matched SNPs. This in-browser application was tested on a variety of personal devices, including smartphones, establishing the feasibility of privately calculating personal risk scores with up to a few thousand reference genetic variations and from the full 23andMe SNP data file (compressed or not).
    Availability and implementation: The PRScalc web application is developed in JavaScript, HTML, and CSS and is available at GitHub repository (https://episphere.github.io/prs) under an MIT license. The datasets were derived from sources in the public domain: [PGS Catalog, Personal Genome Project].
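    At its core, a raw PRS of this kind is a weighted sum of effect-allele dosages over the variants shared between the score file and the genotype. A minimal sketch (the rsIDs and beta weights are invented for illustration; this is not the PRScalc source, which is written in JavaScript):

```python
def raw_prs(genotype, weights):
    """Raw polygenic risk score: sum of beta * effect-allele dosage
    over the variants shared by the genotype and the score file."""
    return sum(beta * genotype[rsid]
               for rsid, beta in weights.items()
               if rsid in genotype)

# Hypothetical dosages (0, 1, or 2 copies of the effect allele) and betas
genotype = {"rs0001": 2, "rs0002": 0, "rs0003": 1}
weights = {"rs0001": 0.12, "rs0003": -0.05, "rs9999": 0.30}  # rs9999 has no match
print(round(raw_prs(genotype, weights), 6))  # 2*0.12 + 1*(-0.05) → 0.19
```

    Unmatched variants simply drop out of the sum, which is why the calculator only needs the intersection of the PGS Catalog entry and the user's SNP file.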
    Language English
    Publishing date 2023-10-09
    Publishing country England
    Document type Journal Article
    ISSN 2635-0041
    ISSN (online) 2635-0041
    DOI 10.1093/bioadv/vbad145
    Database MEDical Literature Analysis and Retrieval System OnLINE

  6. Article: Learning Parsimonious Classification Rules from Gene Expression Data Using Bayesian Networks with Local Structure.

    Lustgarten, Jonathan Lyle / Balasubramanian, Jeya Balaji / Visweswaran, Shyam / Gopalakrishnan, Vanathi

    Data

    2017  Volume 2, Issue 1

    Abstract The comprehensibility of good predictive models learned from high-dimensional gene expression data is attractive because it can lead to biomarker discovery. Several good classifiers provide comparable predictive performance but differ in their abilities to summarize the observed data. We extend a Bayesian Rule Learning (BRL-GSS) algorithm, previously shown to be a significantly better predictor than other classical approaches in this domain. It searches a space of Bayesian networks using a decision tree representation of its parameters with global constraints, and infers a set of IF-THEN rules. The number of parameters, and therefore the number of rules, grows combinatorially with the number of predictor variables in the model. We relax these global constraints to a more generalizable local structure (BRL-LSS). BRL-LSS entails a more parsimonious set of rules because it does not have to generate all combinatorial rules. The search space of local structures is much richer than the space of global structures. We design BRL-LSS with the same worst-case time-complexity as BRL-GSS while exploring a richer and more complex model space. We measure predictive performance using Area Under the ROC curve (AUC) and Accuracy. We measure model parsimony performance by noting the average number of rules and variables needed to describe the observed data. We evaluate the predictive and parsimony performance of BRL-GSS, BRL-LSS and the state-of-the-art C4.5 decision tree algorithm, across 10-fold cross-validation using ten microarray gene-expression diagnostic datasets. In these experiments, we observe that BRL-LSS is similar to BRL-GSS in terms of predictive performance, while generating a much more parsimonious set of rules to explain the same observed data. BRL-LSS also needs fewer variables than C4.5 to explain the data with similar predictive performance.
We also conduct a feasibility study to demonstrate the general applicability of our BRL methods on the newer RNA sequencing gene-expression data.
    Language English
    Publishing date 2017-01-18
    Publishing country Switzerland
    Document type Journal Article
    ISSN 2306-5729
    DOI 10.3390/data2010005
    Database MEDical Literature Analysis and Retrieval System OnLINE

  7. Article ; Online: Moving Toward Findable, Accessible, Interoperable, Reusable Practices in Epidemiologic Research.

    García-Closas, Montserrat / Ahearn, Thomas U / Gaudet, Mia M / Hurson, Amber N / Balasubramanian, Jeya Balaji / Choudhury, Parichoy Pal / Gerlanc, Nicole M / Patel, Bhaumik / Russ, Daniel / Abubakar, Mustapha / Freedman, Neal D / Wong, Wendy S W / Chanock, Stephen J / Berrington de Gonzalez, Amy / Almeida, Jonas S

    American journal of epidemiology

    2023  Volume 192, Issue 6, Page(s) 995–1005

    Abstract Data sharing is essential for reproducibility of epidemiologic research, replication of findings, pooled analyses in consortia efforts, and maximizing study value to address multiple research questions. However, barriers related to confidentiality, costs, and incentives often limit the extent and speed of data sharing. Epidemiological practices that follow Findable, Accessible, Interoperable, Reusable (FAIR) principles can address these barriers by making data resources findable with the necessary metadata, accessible to authorized users, and interoperable with other data, to optimize the reuse of resources with appropriate credit to their creators. We provide an overview of these principles and describe approaches for implementation in epidemiology. Increasing degrees of FAIRness can be achieved by moving data and code from on-site locations to remote, accessible ("Cloud") data servers, using machine-readable and nonproprietary files, and developing open-source code. Adoption of these practices will improve daily work and collaborative analyses and facilitate compliance with data sharing policies from funders and scientific journals. Achieving a high degree of FAIRness will require funding, training, organizational support, recognition, and incentives for sharing research resources, both data and code. However, these costs are outweighed by the benefits of making research more reproducible, impactful, and equitable by facilitating the reuse of precious research resources by the scientific community.
    MeSH term(s) Humans ; Reproducibility of Results ; Confidentiality ; Information Dissemination ; Software ; Epidemiologic Studies
    Language English
    Publishing date 2023-02-21
    Publishing country United States
    Document type Journal Article ; Research Support, N.I.H., Intramural
    ZDB-ID 2937-3
    ISSN (online) 1476-6256
    ISSN 0002-9262
    DOI 10.1093/aje/kwad040
    Database MEDical Literature Analysis and Retrieval System OnLINE

  8. Article: An empirical workflow for genome-wide single nucleotide polymorphism-based predictive modeling.

    Floudas, Charalampos S / Balasubramanian, Jeya Balaji / Romkes, Marjorie / Gopalakrishnan, Vanathi

    AMIA Joint Summits on Translational Science proceedings. AMIA Joint Summits on Translational Science

    2013  Volume 2013, Page(s) 53–57

    Abstract Technology is constantly evolving, necessitating the development of workflows for efficient use of high-dimensional data. We develop and test an empirical workflow for predictive modeling based on single nucleotide polymorphisms (SNPs) from genome-wide association study (GWAS) datasets. To this end, we use as a case study SNP-based prediction of survival for non-small cell lung cancer (NSCLC) with a Bayesian rule learner system (BRL+). Lung cancer is a leading cause of mortality. Standard treatment for early stages of NSCLC is surgery. Adjuvant chemotherapy would be beneficial for patients with early recurrence; consequently, we need models capable of such prediction. This workflow outlines the challenges involved in processing GWAS datasets from one popular platform (Affymetrix®), from the results files of the hybridization experiment to the model construction. Our results show that our workflow is feasible and efficient for processing such data while also yielding SNP-based models with high predictive accuracy under cross-validation.
    Language English
    Publishing date 2013-03-18
    Publishing country United States
    Document type Journal Article
    ZDB-ID 2676378-3
    ISSN 2153-4063
    Database MEDical Literature Analysis and Retrieval System OnLINE

  9. Book ; Online: Moving towards FAIR practices in epidemiological research

    Garcia-Closas, Montserrat / Ahearn, Thomas U. / Gaudet, Mia M. / Hurson, Amber N. / Balasubramanian, Jeya Balaji / Choudhury, Parichoy Pal / Gerlanc, Nicole M. / Patel, Bhaumik / Russ, Daniel / Abubakar, Mustapha / Freedman, Neal D. / Wong, Wendy S. W. / Chanock, Stephen J. / de Gonzalez, Amy Berrington / Almeida, Jonas S

    2022  

    Abstract Reproducibility and replicability of research findings are central to the scientific integrity of epidemiology. In addition, many research questions require combining data from multiple sources to achieve adequate statistical power. However, barriers related to confidentiality, costs, and incentives often limit the extent and speed of sharing resources, both data and code. Epidemiological practices that follow FAIR principles can address these barriers by making resources (F)indable with the necessary metadata, (A)ccessible to authorized users, and (I)nteroperable with other data, to optimize the (R)e-use of resources with appropriate credit to their creators. We provide an overview of these principles and describe approaches for implementation in epidemiology. Increasing degrees of FAIRness can be achieved by moving data and code from on-site locations to the Cloud, using machine-readable and non-proprietary files, and developing open-source code. Adoption of these practices will improve daily work and collaborative analyses, and facilitate compliance with data sharing policies from funders and scientific journals. Achieving a high degree of FAIRness will require funding, training, organizational support, recognition, and incentives for sharing resources. But these costs are amply outweighed by the benefits of making research more reproducible, impactful, and equitable by facilitating the re-use of precious research resources by the scientific community.
    Keywords Quantitative Biology - Populations and Evolution
    Publishing date 2022-06-13
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

  10. Article ; Online: On Predicting lung cancer subtypes using 'omic' data from tumor and tumor-adjacent histologically-normal tissue.

    Pineda, Arturo López / Ogoe, Henry Ato / Balasubramanian, Jeya Balaji / Rangel Escareño, Claudia / Visweswaran, Shyam / Herman, James Gordon / Gopalakrishnan, Vanathi

    BMC cancer

    2016  Volume 16, Page(s) 184

    Abstract Background: Adenocarcinoma (ADC) and squamous cell carcinoma (SCC) are the most prevalent histological types among lung cancers. Distinguishing between these subtypes is critically important because they have different implications for prognosis and treatment. Normally, histopathological analyses are used to distinguish between the two, where the tissue samples are collected based on small endoscopic samples or needle aspirations. However, the lack of cell architecture in these small tissue samples hampers the process of distinguishing between the two subtypes. Molecular profiling can also be used to discriminate between the two lung cancer subtypes, on condition that the biopsy is composed of at least 50 % of tumor cells. However, for some cases, the tissue composition of a biopsy might be a mix of tumor and tumor-adjacent histologically normal tissue (TAHN). When this happens, a new biopsy is required, with associated cost, risks and discomfort to the patient. To avoid this problem, we hypothesize that a computational method can distinguish between lung cancer subtypes given tumor and TAHN tissue.
    Methods: Using publicly available datasets for gene expression and DNA methylation, we applied four classification tasks, depending on the possible combinations of tumor and TAHN tissue. First, we used a feature selector (ReliefF/Limma) to select relevant variables, which were then used to build a simple naïve Bayes classification model. Then, we evaluated the classification performance of our models by measuring the area under the receiver operating characteristic curve (AUC). Finally, we analyzed the relevance of the selected genes using hierarchical clustering and IPA® software for gene functional analysis.
    Results: All Bayesian models achieved high classification performance (AUC > 0.94), which were confirmed by hierarchical cluster analysis. From the genes selected, 25 (93 %) were found to be related to cancer (19 were associated with ADC or SCC), confirming the biological relevance of our method.
    Conclusions: The results from this study confirm that computational methods using tumor and TAHN tissue can serve as a prognostic tool for lung cancer subtype classification. Our study complements results from other studies where TAHN tissue has been used as prognostic tool for prostate cancer. The clinical implications of this finding could greatly benefit lung cancer patients.
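    The classification pipeline described in the Methods, univariate feature selection feeding a naive Bayes classifier scored by AUC, can be sketched on synthetic data. The dataset, the selector (`f_classif` standing in for ReliefF/Limma), and the parameters below are illustrative, not those of the study:

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline

# Synthetic stand-in for an expression matrix: 100 samples x 500 genes,
# where only the first 5 genes carry the class signal (e.g. ADC vs SCC).
rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=100)
X = rng.normal(size=(100, 500))
X[:, :5] += y[:, None] * 1.5          # shift informative genes by class

# Univariate filter selection followed by a naive Bayes classifier,
# evaluated by cross-validated AUC.
model = make_pipeline(SelectKBest(f_classif, k=10), GaussianNB())
auc = cross_val_score(model, X, y, cv=5, scoring="roc_auc").mean()
print(f"mean AUC: {auc:.2f}")
```

    Keeping the selector inside the pipeline means it is refit on each training fold, so feature selection never sees the held-out labels.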
    MeSH term(s) Adenocarcinoma/diagnosis ; Adenocarcinoma/genetics ; Bayes Theorem ; Carcinoma, Squamous Cell/diagnosis ; Carcinoma, Squamous Cell/genetics ; Cluster Analysis ; Computational Biology/methods ; DNA Methylation ; Databases, Nucleic Acid ; Datasets as Topic ; Gene Expression Profiling/methods ; Gene Expression Regulation, Neoplastic ; Gene Regulatory Networks ; Genomics/methods ; Humans ; Lung Neoplasms/diagnosis ; Lung Neoplasms/genetics ; Prognosis ; Reproducibility of Results
    Language English
    Publishing date 2016-03-04
    Publishing country England
    Document type Journal Article ; Research Support, N.I.H., Extramural ; Research Support, Non-U.S. Gov't
    ZDB-ID 2041352-X
    ISSN (online) 1471-2407
    ISSN 1471-2407
    DOI 10.1186/s12885-016-2223-3
    Database MEDical Literature Analysis and Retrieval System OnLINE
