LIVIVO - The Search Portal for Life Sciences

Search results

Result 1 - 10 of total 11

  1. Article ; Online: Analysis of Fluxomic Experiments with Principal Metabolic Flux Mode Analysis.

    Bhadra, Sahely / Rousu, Juho

    Methods in molecular biology (Clifton, N.J.)

    2018  Volume 1807, Page(s) 141–161

    Abstract In the analysis of metabolism, two distinct and complementary approaches are frequently used: Principal component analysis (PCA) and stoichiometric flux analysis. PCA is able to capture the main modes of variability in a set of experiments and does not make many prior assumptions about the data, but does not inherently take into account the flux mode structure of metabolism. Stoichiometric flux analysis methods, such as Flux Balance Analysis (FBA) and Elementary Mode Analysis, on the other hand, are able to capture the metabolic flux modes; however, they are primarily designed for the analysis of single samples at a time, and assume the stoichiometric steady state of the metabolic network. We will discuss a new methodology for the analysis of metabolism, called Principal Metabolic Flux Mode Analysis (PMFA), which marries the PCA and stoichiometric flux analysis approaches in an elegant regularized optimization framework. In short, the method incorporates a variance maximization objective from PCA coupled with a stoichiometric regularizer, which penalizes projections that are far from any flux modes of the network. For interpretability, we also discuss a sparse variant of PMFA that favors flux modes that contain a small number of reactions. PMFA has several benefits: (1) it can be applied to large metabolic networks in an efficient way, as PMFA does not enumerate elementary modes, (2) the method is more robust to steady-state violations than competing approaches, and (3) it can compactly capture the variation in the data by a few factors. This chapter describes the detailed steps for applying PMFA to experimental data from fluxomic and gene expression measurements.
    MeSH term(s) Algorithms ; Metabolic Flux Analysis/methods ; Principal Component Analysis ; Saccharomyces cerevisiae/metabolism
    Language English
    Publishing date 2018-07-20
    Publishing country United States
    Document type Journal Article
    ISSN 1940-6029
    ISSN (online) 1940-6029
    DOI 10.1007/978-1-4939-8561-6_11
    Database MEDical Literature Analysis and Retrieval System OnLINE

  2. Book ; Online: Blacks is to Anger as Whites is to Joy? Understanding Latent Affective Bias in Large Pre-trained Neural Language Models

    Kadan, Anoop / P., Deepak / Bhadra, Sahely / Gangan, Manjary P. / L, Lajish V.

    2023  

    Abstract Groundbreaking inventions and highly significant performance improvements in deep learning based Natural Language Processing are witnessed through the development of transformer based large Pre-trained Language Models (PLMs). The wide availability of unlabeled data within human generated data deluge along with self-supervised learning strategy helps to accelerate the success of large PLMs in language generation, language understanding, etc. But at the same time, latent historical bias/unfairness in human minds towards a particular gender, race, etc., encoded unintentionally/intentionally into the corpora harms and questions the utility and efficacy of large PLMs in many real-world applications, particularly for the protected groups. In this paper, we present an extensive investigation towards understanding the existence of "Affective Bias" in large PLMs to unveil any biased association of emotions such as anger, fear, joy, etc., towards a particular gender, race or religion with respect to the downstream task of textual emotion detection. We conduct our exploration of affective bias from the very initial stage of corpus level affective bias analysis by searching for imbalanced distribution of affective words within a domain, in large scale corpora that are used to pre-train and fine-tune PLMs. Later, to quantify affective bias in model predictions, we perform an extensive set of class-based and intensity-based evaluations using various bias evaluation corpora. Our results show the existence of statistically significant affective bias in the PLM based emotion detection systems, indicating biased association of certain emotions towards a particular gender, race, and religion.
    Keywords Computer Science - Computation and Language ; Computer Science - Artificial Intelligence
    Subject code 006
    Publishing date 2023-01-21
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

  3. Article ; Online: Principal metabolic flux mode analysis.

    Bhadra, Sahely / Blomberg, Peter / Castillo, Sandra / Rousu, Juho

    Bioinformatics (Oxford, England)

    2018  Volume 34, Issue 14, Page(s) 2409–2417

    Abstract Motivation: In the analysis of metabolism, two distinct and complementary approaches are frequently used: Principal component analysis (PCA) and stoichiometric flux analysis. PCA is able to capture the main modes of variability in a set of experiments and does not make many prior assumptions about the data, but does not inherently take into account the flux mode structure of metabolism. Stoichiometric flux analysis methods, such as Flux Balance Analysis (FBA) and Elementary Mode Analysis, on the other hand, are able to capture the metabolic flux modes; however, they are primarily designed for the analysis of single samples at a time, and are not best suited for exploratory analysis on large sets of samples.
    Results: We propose a new methodology for the analysis of metabolism, called Principal Metabolic Flux Mode Analysis (PMFA), which marries the PCA and stoichiometric flux analysis approaches in an elegant regularized optimization framework. In short, the method incorporates a variance maximization objective from PCA coupled with a stoichiometric regularizer, which penalizes projections that are far from any flux modes of the network. For interpretability, we also introduce a sparse variant of PMFA that favours flux modes that contain a small number of reactions. Our experiments demonstrate the versatility and capabilities of our methodology. The proposed method can be applied to genome-scale metabolic networks in an efficient way, as PMFA does not enumerate elementary modes. In addition, the method is more robust on out-of-steady steady-state experimental data than competing flux mode analysis approaches.
    Availability and implementation: Matlab software for PMFA and SPMFA and dataset used for experiments are available in https://github.com/aalto-ics-kepaco/PMFA.
    Supplementary information: Supplementary data are available at Bioinformatics online.
    MeSH term(s) Metabolic Flux Analysis/methods ; Metabolic Networks and Pathways ; Models, Biological ; Principal Component Analysis ; Software
    Language English
    Publishing date 2018-02-08
    Publishing country England
    Document type Journal Article ; Research Support, Non-U.S. Gov't
    ZDB-ID 1422668-6
    ISSN 1367-4811 ; 1367-4803
    ISSN (online) 1367-4811
    ISSN 1367-4803
    DOI 10.1093/bioinformatics/bty049
    Database MEDical Literature Analysis and Retrieval System OnLINE

  4. Book ; Online: Learning primal-dual sparse kernel machines

    Huusari, Riikka / Bhadra, Sahely / Capponi, Cécile / Kadri, Hachem / Rousu, Juho

    2021  

    Abstract Traditionally, kernel methods rely on the representer theorem, which states that the solution to a learning problem is obtained as a linear combination of the data mapped into the reproducing kernel Hilbert space (RKHS). While elegant from a theoretical point of view, the theorem is prohibitive for algorithms' scalability to large datasets, and the interpretability of the learned function. In this paper, instead of using the traditional representer theorem, we propose to search for a solution in RKHS that has a pre-image decomposition in the original data space, where the elements don't necessarily correspond to the elements in the training set. Our gradient-based optimisation method then hinges on optimising over possibly sparse elements in the input space, and enables us to obtain a kernel-based model with both primal and dual sparsity. We give theoretical justification on the proposed method's generalization ability via a Rademacher bound. Our experiments demonstrate better scalability and interpretability with accuracy on par with the traditional kernel-based models.
    Keywords Computer Science - Machine Learning
    Subject code 006
    Publishing date 2021-08-27
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

  5. Book ; Online: Warping Resilient Scalable Anomaly Detection in Time Series

    S, Abilasha / Bhadra, Sahely / P, Deepak / Mathew, Anish

    2019  

    Abstract Time series data is ubiquitous in the real-world problems across various domains including healthcare, social media, and crime surveillance. Detecting anomalies, or irregular and rare events, in time series data, can enable us to find abnormal events in any natural phenomena, which may require special treatment. Moreover, labeled instances of anomaly are hard to get in time series data. On the other hand, time series data, due to its nature, often exhibits localized expansions and compressions in the time dimension which is called warping. These two challenges make it hard to detect anomalies in time series as often such warpings could get detected as anomalies erroneously. Our objective is to build an anomaly detection model that is robust to such warping variations. In this paper, we propose a novel unsupervised time series anomaly detection method, WaRTEm-AD, that operates in two stages. Within the key stage of representation learning, we employ data augmentation through bespoke time series operators which are passed through a twin autoencoder architecture to learn warping-robust representations for time series data. Second, adaptations of state-of-the-art anomaly detection methods are employed on the learnt representations to identify anomalies. We will illustrate that WaRTEm-AD is designed to detect two types of time series anomalies: point and sequence anomalies. We compare WaRTEm-AD with the state-of-the-art baselines and establish the effectiveness of our method both in terms of anomaly detection performance and computational efficiency.

    Comment: October 2021: in communication to ECML PKDD Journal Track
    Keywords Computer Science - Machine Learning ; Statistics - Machine Learning
    Subject code 006
    Publishing date 2019-06-12
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

  6. Article: Principal metabolic flux mode analysis

    Bhadra, Sahely / Blomberg, Peter / Castillo, Sandra / Rousu, Juho / Wren, Jonathan

    Bioinformatics. 2018 July 15, v. 34, no. 14

    2018  

    Abstract In the analysis of metabolism, two distinct and complementary approaches are frequently used: Principal component analysis (PCA) and stoichiometric flux analysis. PCA is able to capture the main modes of variability in a set of experiments and does not make many prior assumptions about the data, but does not inherently take into account the flux mode structure of metabolism. Stoichiometric flux analysis methods, such as Flux Balance Analysis (FBA) and Elementary Mode Analysis, on the other hand, are able to capture the metabolic flux modes; however, they are primarily designed for the analysis of single samples at a time, and are not best suited for exploratory analysis on large sets of samples. We propose a new methodology for the analysis of metabolism, called Principal Metabolic Flux Mode Analysis (PMFA), which marries the PCA and stoichiometric flux analysis approaches in an elegant regularized optimization framework. In short, the method incorporates a variance maximization objective from PCA coupled with a stoichiometric regularizer, which penalizes projections that are far from any flux modes of the network. For interpretability, we also introduce a sparse variant of PMFA that favours flux modes that contain a small number of reactions. Our experiments demonstrate the versatility and capabilities of our methodology. The proposed method can be applied to genome-scale metabolic networks in an efficient way, as PMFA does not enumerate elementary modes. In addition, the method is more robust on out-of-steady steady-state experimental data than competing flux mode analysis approaches. Matlab software for PMFA and SPMFA and dataset used for experiments are available in https://github.com/aalto-ics-kepaco/PMFA. Supplementary data are available at Bioinformatics online.
    Keywords biochemical pathways ; bioinformatics ; computer software ; data collection ; metabolism ; principal component analysis ; variance
    Language English
    Dates of publication 2018-07-15
    Size p. 2409-2417.
    Publishing place Oxford University Press
    Document type Article
    ZDB-ID 1468345-3
    ISSN 1460-2059 ; 1367-4811 ; 1367-4803
    ISSN (online) 1460-2059 ; 1367-4811
    ISSN 1367-4803
    DOI 10.1093/bioinformatics/bty049
    Database NAL-Catalogue (AGRICOLA)

  7. Article ; Online: A linear programming approach for estimating the structure of a sparse linear genetic network from transcript profiling data

    Chandra Nagasuma R / Bhattacharyya Chiranjib / Bhadra Sahely / Mian I Saira

    Algorithms for Molecular Biology, Vol 4, Iss 1, p

    2009  Volume 5

    Abstract Background: A genetic network can be represented as a directed graph in which a node corresponds to a gene and a directed edge specifies the direction of influence of one gene on another. The reconstruction of such networks from transcript profiling data remains an important yet challenging endeavor. A transcript profile specifies the abundances of many genes in a biological sample of interest. Prevailing strategies for learning the structure of a genetic network from high-dimensional transcript profiling data assume sparsity and linearity. Many methods consider relatively small directed graphs, inferring graphs with up to a few hundred nodes. This work examines large undirected graph representations of genetic networks, graphs with many thousands of nodes where an undirected edge between two nodes does not indicate the direction of influence, and the problem of estimating the structure of such a sparse linear genetic network (SLGN) from transcript profiling data. Results: The structure learning task is cast as a sparse linear regression problem which is then posed as a LASSO (l1-constrained fitting) problem and solved finally by formulating a Linear Program (LP). A bound on the Generalization Error of this approach is given in terms of the Leave-One-Out Error. The accuracy and utility of LP-SLGNs is assessed quantitatively and qualitatively using simulated and real data. The Dialogue for Reverse Engineering Assessments and Methods (DREAM) initiative provides gold standard data sets and evaluation metrics that enable and facilitate the comparison of algorithms for deducing the structure of networks. The structures of LP-SLGNs estimated from the INSILICO1, INSILICO2 and INSILICO3 simulated DREAM2 data sets are comparable to those proposed by the first and/or second ranked teams in the DREAM2 competition. The structures of LP-SLGNs ...
    Keywords Biology (General) ; QH301-705.5 ; Genetics ; QH426-470
    Subject code 006
    Language English
    Publishing date 2009-02-01T00:00:00Z
    Publisher BMC
    Document type Article ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

  8. Article ; Online: A linear programming approach for estimating the structure of a sparse linear genetic network from transcript profiling data.

    Bhadra, Sahely / Bhattacharyya, Chiranjib / Chandra, Nagasuma R / Mian, I Saira

    Algorithms for molecular biology : AMB

    2009  Volume 4, Page(s) 5

    Abstract Background: A genetic network can be represented as a directed graph in which a node corresponds to a gene and a directed edge specifies the direction of influence of one gene on another. The reconstruction of such networks from transcript profiling data remains an important yet challenging endeavor. A transcript profile specifies the abundances of many genes in a biological sample of interest. Prevailing strategies for learning the structure of a genetic network from high-dimensional transcript profiling data assume sparsity and linearity. Many methods consider relatively small directed graphs, inferring graphs with up to a few hundred nodes. This work examines large undirected graph representations of genetic networks, graphs with many thousands of nodes where an undirected edge between two nodes does not indicate the direction of influence, and the problem of estimating the structure of such a sparse linear genetic network (SLGN) from transcript profiling data.
    Results: The structure learning task is cast as a sparse linear regression problem which is then posed as a LASSO (l1-constrained fitting) problem and solved finally by formulating a Linear Program (LP). A bound on the Generalization Error of this approach is given in terms of the Leave-One-Out Error. The accuracy and utility of LP-SLGNs is assessed quantitatively and qualitatively using simulated and real data. The Dialogue for Reverse Engineering Assessments and Methods (DREAM) initiative provides gold standard data sets and evaluation metrics that enable and facilitate the comparison of algorithms for deducing the structure of networks. The structures of LP-SLGNs estimated from the INSILICO1, INSILICO2 and INSILICO3 simulated DREAM2 data sets are comparable to those proposed by the first and/or second ranked teams in the DREAM2 competition. The structures of LP-SLGNs estimated from two published Saccharomyces cerevisiae cell cycle transcript profiling data sets capture known regulatory associations. In each S. cerevisiae LP-SLGN, the number of nodes with a particular degree follows an approximate power law suggesting that its degree distribution is similar to that observed in real-world networks. Inspection of these LP-SLGNs suggests biological hypotheses amenable to experimental verification.
    Conclusion: A statistically robust and computationally efficient LP-based method for estimating the topology of a large sparse undirected graph from high-dimensional data yields representations of genetic networks that are biologically plausible and useful abstractions of the structures of real genetic networks. Analysis of the statistical and topological properties of learned LP-SLGNs may have practical value; for example, genes with high random walk betweenness, a measure of the centrality of a node in a graph, are good candidates for intervention studies and hence integrated computational - experimental investigations designed to infer more realistic and sophisticated probabilistic directed graphical model representations of genetic networks. The LP-based solutions of the sparse linear regression problem described here may provide a method for learning the structure of transcription factor networks from transcript profiling and transcription factor binding motif data.
    Language English
    Publishing date 2009-02-24
    Publishing country England
    Document type Journal Article
    ISSN 1748-7188
    ISSN (online) 1748-7188
    DOI 10.1186/1748-7188-4-5
    Database MEDical Literature Analysis and Retrieval System OnLINE

  9. Article ; Online: Prediction of candidate primary immunodeficiency disease genes using a support vector machine learning approach.

    Keerthikumar, Shivakumar / Bhadra, Sahely / Kandasamy, Kumaran / Raju, Rajesh / Ramachandra, Y L / Bhattacharyya, Chiranjib / Imai, Kohsuke / Ohara, Osamu / Mohan, Sujatha / Pandey, Akhilesh

    DNA research : an international journal for rapid publication of reports on genes and genomes

    2009  Volume 16, Issue 6, Page(s) 345–351

    Abstract Screening and early identification of primary immunodeficiency disease (PID) genes is a major challenge for physicians. Many resources have catalogued molecular alterations in known PID genes along with their associated clinical and immunological phenotypes. However, these resources do not assist in identifying candidate PID genes. We have recently developed a platform designated Resource of Asian PDIs, which hosts information pertaining to molecular alterations, protein-protein interaction networks, mouse studies and microarray gene expression profiling of all known PID genes. Using this resource as a discovery tool, we describe the development of an algorithm for prediction of candidate PID genes. Using a support vector machine learning approach, we have predicted 1442 candidate PID genes using 69 binary features of 148 known PID genes and 3162 non-PID genes as a training data set. The power of this approach is illustrated by the fact that six of the predicted genes have recently been experimentally confirmed to be PID genes. The remaining genes in this predicted data set represent attractive candidates for testing in patients where the etiology cannot be ascribed to any of the known PID genes.
    MeSH term(s) Algorithms ; Asia ; Computational Biology/methods ; Databases, Genetic ; Genetic Predisposition to Disease ; Humans ; Immunologic Deficiency Syndromes/genetics ; Predictive Value of Tests ; Proteins/genetics ; Proteins/metabolism ; Sensitivity and Specificity
    Chemical Substances Proteins
    Language English
    Publishing date 2009-10-03
    Publishing country England
    Document type Evaluation Study ; Journal Article ; Research Support, Non-U.S. Gov't
    ZDB-ID 1212508-8
    ISSN 1756-1663 ; 1340-2838
    ISSN (online) 1756-1663
    ISSN 1340-2838
    DOI 10.1093/dnares/dsp019
    Database MEDical Literature Analysis and Retrieval System OnLINE

  10. Article ; Online: RAPID: Resource of Asian Primary Immunodeficiency Diseases.

    Keerthikumar, Shivakumar / Raju, Rajesh / Kandasamy, Kumaran / Hijikata, Atsushi / Ramabadran, Subhashri / Balakrishnan, Lavanya / Ahmed, Mukhtar / Rani, Sandhya / Selvan, Lakshmi Dhevi N / Somanathan, Devi S / Ray, Somak / Bhattacharjee, Mitali / Gollapudi, Sashikanth / Ramachandra, Y L / Bhadra, Sahely / Bhattacharyya, Chiranjib / Imai, Kohsuke / Nonoyama, Shigeaki / Kanegane, Hirokazu / Miyawaki, Toshio / Pandey, Akhilesh / Ohara, Osamu / Mohan, Sujatha

    Nucleic acids research

    2009  Volume 37, Issue Database issue, Page(s) D863–7

    Abstract Availability of a freely accessible, dynamic and integrated database for primary immunodeficiency diseases (PID) is important both for researchers as well as clinicians. To build a PID informational platform and also as a part of action to initiate a network of PID research in Asia, we have constructed a web-based compendium of molecular alterations in PID, named Resource of Asian Primary Immunodeficiency Diseases (RAPID), which is available as a worldwide web resource at http://rapid.rcai.riken.jp/. It hosts information on sequence variations and expression at the mRNA and protein levels of all genes reported to be involved in PID patients. The main objective of this database is to provide detailed information pertaining to genes and proteins involved in primary immunodeficiency diseases along with other relevant information about protein-protein interactions, mouse studies and microarray gene-expression profiles in various organs and cells of the immune system. RAPID also hosts a tool, mutation viewer, to predict deleterious and novel mutations and also to obtain mutation-based 3D structures for PID genes. Thus, information contained in this database should help physicians and other biomedical investigators to further investigate the role of these molecules in PID.
    MeSH term(s) Animals ; Asia ; Databases, Genetic ; Gene Expression Profiling ; Humans ; Immunologic Deficiency Syndromes/genetics ; Immunologic Deficiency Syndromes/metabolism ; Mice ; Mutation ; Proteins/genetics ; Proteins/metabolism ; RNA, Messenger/chemistry ; RNA, Messenger/metabolism
    Chemical Substances Proteins ; RNA, Messenger
    Language English
    Publishing date 2009-01
    Publishing country England
    Document type Journal Article ; Research Support, Non-U.S. Gov't
    ZDB-ID 2205588-5
    ISSN 1362-4962 ; 1746-8272 ; 0305-1048 ; 0261-3166
    ISSN (online) 1362-4962 ; 1746-8272
    ISSN 0305-1048 ; 0261-3166
    DOI 10.1093/nar/gkn682
    Database MEDical Literature Analysis and Retrieval System OnLINE
