LIVIVO - The Search Portal for Life Sciences

Search results

Result 1 - 10 of total 15

  1. Article ; Online: FedDNA

    Shuwen Wang / Xingquan Zhu

    PLoS ONE, Vol 18, Iss 7, p e0288157

    Federated learning using dynamic node alignment.

    2023

    Abstract Federated Learning (FL), as a new computing framework, has received significant attention recently due to its advantages in preserving data privacy while training models with superb performance. During FL learning, distributed sites first learn their respective parameters. A central site then consolidates the learned parameters, using averaging or other approaches, and disseminates new weights across all sites to carry out the next round of learning. The distributed parameter learning and consolidation repeat iteratively until the algorithm converges or terminates. Many FL methods exist to aggregate weights from distributed sites, but most use a static node alignment approach, where nodes of distributed networks are statically assigned, in advance, to match nodes and aggregate their weights. In reality, neural networks, especially dense networks, have nontransparent roles with respect to individual nodes. Combined with the random nature of the networks, static node matching often does not result in the best matching between nodes across sites. In this paper, we propose FedDNA, a dynamic node alignment federated learning algorithm. Our theme is to find the best matching nodes between different sites, and then aggregate the weights of matching nodes for federated learning. For each node in a neural network, we represent its weight values as a vector, and use a distance function to find the most similar nodes, i.e., nodes with the smallest distance from other sites. Because finding the best matching across all sites is computationally expensive, we further design a minimum spanning tree based approach to ensure that a node from each site will have matched peers from other sites, such that the total pairwise distances across all sites are minimized. Experiments and comparisons demonstrate that FedDNA outperforms commonly used baselines, such as FedAvg, for federated learning.
    Keywords Medicine ; R ; Science ; Q
    Subject code 006
    Language English
    Publishing date 2023-01-01T00:00:00Z
    Publisher Public Library of Science (PLoS)
    Document type Article ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

  2. Article ; Online: Exploring investor-business-market interplay for business success prediction

    Divya Gangwani / Xingquan Zhu / Borko Furht

    Journal of Big Data, Vol 10, Iss 1, Pp 1-28

    2023

    Abstract The success of a business directly contributes to the growth of the nation. Hence it is important to evaluate and predict whether a business will be successful or not. In this study, we use the company’s dataset, which contains information from startups to Fortune 1000 companies, to create a machine learning model for predicting business success. The main challenge of business success prediction is twofold: (1) identifying variables for defining business success; (2) feature selection and feature engineering based on the Investor-Business-Market interrelation to provide a successful outcome of the predictive modeling. Many studies have been carried out using only the available features to predict business success; however, it remains a challenge to identify the most important features from different business angles and their interrelation with business success. Motivated by this challenge, we propose a new approach that defines a new business target based on the definition of business success used in this study and develops additional features by carrying out statistical analysis on the training data, which highlights the importance of investment, business, and market features in forecasting business success, instead of using only the available features for modeling. Ensemble machine learning methods as well as existing supervised learning methods were applied to predict business success. The results demonstrated a significant improvement in the overall accuracy and AUC score using ensemble methods. Adding new features related to the Investor-Business-Market entity demonstrated good performance in predicting business success and showed how important it is to identify significant relationships between these features to cover different business angles when predicting business success.
    Keywords Machine learning methods ; Investments-business-market ; Feature engineering ; Success prediction ; Computer engineering. Computer hardware ; TK7885-7895 ; Information technology ; T58.5-58.64 ; Electronic computers. Computer science ; QA75.5-76.95
    Subject code 650
    Language English
    Publishing date 2023-04-01T00:00:00Z
    Publisher SpringerOpen
    Document type Article ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

  3. Article ; Online: Predictive modeling of clinical trial terminations using feature engineering and embedding learning

    Magdalyn E. Elkin / Xingquan Zhu

    Scientific Reports, Vol 11, Iss 1, Pp 1-12

    2021

    Abstract In this study, we propose to use machine learning to understand terminated clinical trials. Our goal is to answer two fundamental questions: (1) what are common factors/markers associated with terminated clinical trials? and (2) how to accurately predict whether a clinical trial may be terminated or not? The answer to the first question provides effective ways to understand characteristics of terminated trials for stakeholders to better plan their trials; and the answer to the second question can directly estimate the chance of success of a clinical trial in order to minimize costs. By using 311,260 trials to build a testbed with 68,999 samples, we use feature engineering to create 640 features, reflecting clinical trial administration, eligibility, study information, criteria, etc. Using feature ranking, a handful of features, such as trial eligibility, trial inclusion/exclusion criteria, sponsor types, etc., are found to be related to clinical trial termination. By using sampling and ensemble learning, we achieve over 67% Balanced Accuracy and over 0.73 AUC (Area Under the Curve) scores in predicting clinical trial termination, indicating that machine learning can help achieve satisfactory prediction results for clinical trial studies.
    Keywords Medicine ; R ; Science ; Q
    Subject code 610 ; 310
    Language English
    Publishing date 2021-02-01T00:00:00Z
    Publisher Nature Portfolio
    Document type Article ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

  4. Article ; Online: Understanding and predicting COVID-19 clinical trial completion vs. cessation.

    Magdalyn E Elkin / Xingquan Zhu

    PLoS ONE, Vol 16, Iss 7, p e0253789

    2021

    Abstract As of March 30, 2021, over 5,193 COVID-19 clinical trials have been registered through ClinicalTrials.gov. Among them, 191 trials were terminated, suspended, or withdrawn (indicating the cessation of the study). On the other hand, 909 trials have been completed (indicating the completion of the study). In this study, we propose to study underlying factors of COVID-19 trial completion vs. cessation, and design predictive models to accurately predict whether a COVID-19 trial may complete or cease in the future. We collect 4,441 COVID-19 trials from ClinicalTrials.gov to build a testbed, and design four types of features to characterize clinical trial administration, eligibility, study information, criteria, drug types, study keywords, as well as embedding features commonly used in state-of-the-art machine learning. Our study shows that drug features and study keywords are the most informative features, but all four types of features are essential for accurate trial prediction. By using predictive models, our approach achieves more than 0.87 AUC (Area Under the Curve) score and 0.81 balanced accuracy in predicting COVID-19 clinical trial completion vs. cessation. Our research shows that computational methods can deliver effective features to understand the difference between completed and ceased COVID-19 trials. In addition, such models can also predict COVID-19 trial status with satisfactory accuracy, and help stakeholders better plan trials and minimize costs.
    Keywords Medicine ; R ; Science ; Q
    Subject code 610
    Language English
    Publishing date 2021-01-01T00:00:00Z
    Publisher Public Library of Science (PLoS)
    Document type Article ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

  5. Article: Cross-Domain Semi-Supervised Learning Using Feature Formulation.

    Xingquan Zhu

    IEEE transactions on systems, man, and cybernetics. Part B, Cybernetics : a publication of the IEEE Systems, Man, and Cybernetics Society

    2011  Volume 41, Issue 6, Page(s) 1627–1638

    Abstract Semi-Supervised Learning (SSL) traditionally makes use of unlabeled samples by including them in the training set through an automated labeling process. Such a primitive Semi-Supervised Learning (pSSL) approach suffers from a number of disadvantages, including false labeling and the inability to utilize out-of-domain samples. In this paper, we propose a formative Semi-Supervised Learning (fSSL) framework which explores hidden features between labeled and unlabeled samples to achieve semi-supervised learning. fSSL assumes that both labeled and unlabeled samples are generated from some hidden concepts, with labeling information partially observable for some samples. The key to fSSL is to recover the hidden concepts, and take them as new features to link labeled and unlabeled samples for semi-supervised learning. Because unlabeled samples are only used to generate new features, and are not explicitly included in the training set as in pSSL, fSSL overcomes the inherent disadvantages of traditional pSSL methods, especially for samples not within the same domain as the labeled instances. Experimental results and comparisons demonstrate that fSSL significantly outperforms pSSL-based methods for both within-domain and cross-domain semi-supervised learning.
    Language English
    Publishing date 2011-12
    Publishing country United States
    Document type Journal Article
    ISSN 1083-4419
    DOI 10.1109/TSMCB.2011.2157999
    Database MEDical Literature Analysis and Retrieval System OnLINE

  6. Article ; Online: New insights into developmental biology of Eimeria tenella revealed by comparative analysis of mRNA N6-methyladenosine modification between unsporulated oocysts and sporulated oocysts

    Qing Liu / Bingjin Mu / Yijing Meng / Linmei Yu / Zirui Wang / Tao Jia / Wenbin Zheng / Wenwei Gao / Shichen Xie / Xingquan Zhu

    Journal of Integrative Agriculture, Vol 23, Iss 1, Pp 239-250

    2024

    Abstract Evidence showed that N6-methyladenosine (m6A) modification plays a pivotal role in influencing RNA fate and is strongly associated with cell growth and developmental processes in many species. However, no information regarding m6A modification in Eimeria tenella is currently available. In the present study, we surveyed the transcriptome-wide prevalence of m6A in sporulated oocysts and unsporulated oocysts of E. tenella. Methylated RNA immunoprecipitation sequencing (MeRIP-seq) analysis showed that m6A modification was most abundant in the coding sequences, followed by the stop codon. There were 3,903 hypermethylated and 3,178 hypomethylated mRNAs in sporulated oocysts compared with unsporulated oocysts. Further joint analysis suggested that m6A modification of the majority of genes was positively correlated with mRNA expression. The mRNA relative expression and m6A level of the selected genes were confirmed by quantitative reverse transcription PCR (RT-qPCR) and MeRIP-qPCR. GO and KEGG analysis indicated that differentially m6A methylated genes (DMMGs) with significant differences in mRNA expression were closely related to processes such as regulation of gene expression, epigenetic, microtubule, autophagy-other and TOR signaling. Moreover, a total of 96 DMMGs without significant differences in mRNA expression showed significant differences at the protein level. GO and pathway enrichment analysis of the 96 genes showed that RNA methylation may be involved in cell biosynthesis and metabolism of E. tenella. We present the first map of RNA m6A modification in E. tenella, which provides significant insights into the developmental biology of E. tenella.
    Keywords Eimeria tenella ; m6A ; RNA methylation ; MeRIP-seq ; RNA-seq ; proteomic analysis ; Agriculture (General) ; S1-972
    Subject code 612
    Language English
    Publishing date 2024-01-01T00:00:00Z
    Publisher Elsevier
    Document type Article ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

  7. Article ; Online: Deep Learning for User Interest and Response Prediction in Online Display Advertising

    Zhabiz Gharibshah / Xingquan Zhu / Arthur Hainline / Michael Conway

    Data Science and Engineering, Vol 5, Iss 1, Pp 12-26

    2020

    Abstract User interest and behavior modeling is a critical step in online digital advertising. On the one hand, user interests directly impact their response and actions to the displayed advertisement (Ad). On the other hand, user interests can further help determine the probability of an Ad viewer becoming a buying customer. To date, existing methods for Ad click prediction, or click-through rate prediction, mainly represent users as a static feature set and train machine learning classifiers to predict clicks. Such approaches do not consider temporal variance and changes in user behaviors, and rely solely on given features for learning. In this paper, we propose two deep learning-based frameworks, $\mathrm{LSTM}_{\mathrm{cp}}$ and $\mathrm{LSTM}_{\mathrm{ip}}$, for user click prediction and user interest modeling. Our goal is to accurately predict (1) the probability of a user clicking on an Ad and (2) the probability of a user clicking a specific type of Ad campaign. To achieve this goal, we collect page information displayed to the users as a temporal sequence and use a long short-term memory (LSTM) network to learn features that represent user interests as latent features. Experiments and comparisons on real-world data show that, compared to existing static set-based approaches, considering sequences and temporal variance of user requests results in improvements in user Ad response prediction and campaign-specific user Ad click prediction.
    Keywords Click prediction ; Display advertising ; Campaign ; LSTM network ; Deep learning ; Information technology ; T58.5-58.64 ; Electronic computers. Computer science ; QA75.5-76.95
    Subject code 004
    Language English
    Publishing date 2020-01-01T00:00:00Z
    Publisher SpringerOpen
    Document type Article ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

  8. Article ; Online: NOSEP: Nonoverlapping Sequence Pattern Mining With Gap Constraints.

    Youxi Wu / Yao Tong / Xingquan Zhu / Xindong Wu

    IEEE transactions on cybernetics

    2017  Volume 48, Issue 10, Page(s) 2809–2822

    Abstract Sequence pattern mining aims to discover frequent subsequences as patterns in a single sequence or a sequence database. By combining gap constraints (or flexible wildcards), users can specify special characteristics of the patterns and discover meaningful subsequences suitable for their own application domains, such as finding gene transcription sites from DNA sequences or discovering patterns for time series data classification. Due to the inherent complexity of sequence patterns, including the exponential candidate space with respect to pattern letters and gap constraints, to date, existing sequence pattern mining methods are either incomplete or do not support the Apriori property because the support ratio of a pattern may be greater than that of its subpatterns. Most importantly, patterns discovered by these methods are either too restrictive or too general and cannot represent underlying meaningful knowledge in the sequences. In this paper, we focus on a nonoverlapping sequence pattern mining task with gap constraints, where a nonoverlapping sequence pattern allows sequence letters to be flexibly and maximally utilized for pattern discovery. A new Apriori-based nonoverlapping sequence pattern mining algorithm, NOSEP, is proposed. NOSEP is a complete pattern mining algorithm, which uses a specially designed data structure, Nettree, to calculate the exact occurrence of a pattern in the sequence. Experimental results and comparisons on biology DNA sequences, time series data, and Gazelle datasets demonstrate the efficiency of the proposed algorithm and the uniqueness of nonoverlapping sequence patterns compared to other methods.
    Language English
    Publishing date 2017-09-28
    Publishing country United States
    Document type Journal Article
    ISSN 2168-2275
    ISSN (online) 2168-2275
    DOI 10.1109/TCYB.2017.2750691
    Database MEDical Literature Analysis and Retrieval System OnLINE

  9. Article ; Online: Deep learning data augmentation for Raman spectroscopy cancer tissue classification

    Man Wu / Shuwen Wang / Shirui Pan / Andrew C. Terentis / John Strasswimmer / Xingquan Zhu

    Scientific Reports, Vol 11, Iss 1, Pp 1-13

    2021

    Abstract Recently, Raman Spectroscopy (RS) was demonstrated to be a non-destructive method for cancer diagnosis, due to the uniqueness of RS measurements in revealing molecular biochemical changes between cancerous and normal tissues and cells. In order to design computational approaches for cancer detection, the quality and quantity of tissue samples for RS are important for accurate prediction. In reality, however, obtaining skin cancer samples is difficult and expensive due to privacy and other constraints. With a small number of samples, the training of the classifier is difficult, and often results in overfitting. Therefore, it is important to have more samples to better train classifiers for accurate cancer tissue classification. To overcome these limitations, this paper presents a novel generative adversarial network based skin cancer tissue classification framework. Specifically, we design a data augmentation module that employs a Generative Adversarial Network (GAN) to generate synthetic RS data resembling the training data classes. The original tissue samples and the generated data are concatenated to train classification modules. Experiments on real-world RS data demonstrate that (1) data augmentation can help improve skin cancer tissue classification accuracy, and (2) a generative adversarial network can be used to generate reliable synthetic Raman spectroscopic data.
    Keywords Medicine ; R ; Science ; Q
    Subject code 006
    Language English
    Publishing date 2021-12-01T00:00:00Z
    Publisher Nature Portfolio
    Document type Article ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

  10. Article ; Online: Identification and Protective Efficacy of Eimeria tenella Rhoptry Kinase Family Protein 17

    Xiaoxin Liu / Bingjin Mu / Wenbin Zheng / Yijing Meng / Linmei Yu / Wenwei Gao / Xingquan Zhu / Qing Liu

    Animals, Vol 12, Iss 556, p 556

    2022

    Abstract Eimeria tenella encodes a genome of approximately 8000 genes. To date, however, very few data are available regarding E. tenella rhoptry kinase family proteins. In the present study, the gene fragment encoding the mature peptide of the rhoptry kinase family protein 17 of E. tenella (EtROP17) was amplified by PCR and expressed in E. coli. Then, we generated polyclonal antibodies that recognize EtROP17 and investigated the expression of EtROP17 in the merozoite stage of E. tenella by immunofluorescent staining and Western blot analysis. Meanwhile, the protective efficacy of rEtROP17 against E. tenella was evaluated in chickens. Sequencing analysis showed that a single base difference at sequence position 1901 was observed between the SD-01 strain and the Houghton strain. EtROP17 was expressed in the merozoite stage of E. tenella. The results of the animal challenge experiments demonstrated that vaccination with rEtROP17 significantly reduced cecal lesions and oocyst outputs compared with the challenged control group. Our findings indicate that EtROP17 could serve as a potential candidate for developing a new vaccine against E. tenella.
    Keywords Eimeria tenella ; rhoptry kinase family protein 17 ; merozoite ; protective efficacy ; Veterinary medicine ; SF600-1100 ; Zoology ; QL1-991
    Subject code 572
    Language English
    Publishing date 2022-02-01T00:00:00Z
    Publisher MDPI AG
    Document type Article ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)
