LIVIVO - The Search Portal for Life Sciences

Search results

Result 1 - 8 of total 8

  1. Book ; Online: GPTAraEval

    Khondaker, Md Tawkat Islam / Waheed, Abdul / Nagoudi, El Moatez Billah / Abdul-Mageed, Muhammad

    A Comprehensive Evaluation of ChatGPT on Arabic NLP

    2023  

    Abstract The recent emergence of ChatGPT has brought a revolutionary change in the landscape of NLP. Although ChatGPT has consistently shown impressive performance on English benchmarks, its exact capabilities on most other languages remain largely unknown. To better understand ChatGPT's capabilities on Arabic, we present a large-scale evaluation of the model on a broad range of Arabic NLP tasks. Namely, we evaluate ChatGPT on 32 diverse natural language understanding and generation tasks on over 60 different datasets. To the best of our knowledge, our work offers the first performance analysis of ChatGPT on Arabic NLP at such a massive scale. Our results show that, despite its success on English benchmarks, ChatGPT trained in-context (few-shot) is consistently outperformed by much smaller dedicated models finetuned on Arabic. These results suggest that there is significant place for improvement for instruction-tuned LLMs such as ChatGPT.

    Comment: Work in progress
    Keywords Computer Science - Computation and Language ; Computer Science - Machine Learning
    Subject code 006
    Publishing date 2023-05-24
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

  2. Book ; Online: JASMINE

    Nagoudi, El Moatez Billah / Abdul-Mageed, Muhammad / Elmadany, AbdelRahim / Inciarte, Alcides Alcoba / Khondaker, Md Tawkat Islam

    Arabic GPT Models for Few-Shot Learning

    2022  

    Abstract Task-agnostic generative pretraining (GPT) has recently proved promising for zero- and few-shot learning, gradually diverting attention from the expensive supervised learning paradigm. Although the community is accumulating knowledge as to the capabilities of English-language autoregressive models such as GPT-3 adopting this generative approach, scholarship about these models remains acutely Anglocentric. Consequently, the community currently has serious gaps in its understanding of this class of models, their potential, and their societal impacts in diverse settings, linguistic traditions, and cultures. To alleviate this issue for Arabic, a collection of diverse languages and language varieties with a population of more than 400 million, we introduce JASMINE, a suite of powerful Arabic autoregressive Transformer language models ranging in size from 300 million to 13 billion parameters. We pretrain our new models with large amounts of diverse data (400GB of text) from different Arabic varieties and domains. We evaluate JASMINE extensively in both intrinsic and extrinsic settings, using a comprehensive benchmark for zero- and few-shot learning across a wide range of NLP tasks. We also carefully develop and release a novel benchmark for both automated and human evaluation of Arabic autoregressive models focused on investigating potential social biases, harms, and toxicity in these models. We aim to responsibly release our models to interested researchers, along with code for experimenting with them.
    Keywords Computer Science - Computation and Language
    Publishing date 2022-12-20
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

  3. Article: Obesity in Qatar: A Case-Control Study on the Identification of Associated Risk Factors.

    Khondaker, Md Tawkat Islam / Khan, Junaed Younus / Refaee, Mahmoud Ahmed / Hajj, Nady El / Rahman, M Sohel / Alam, Tanvir

    Diagnostics (Basel, Switzerland)

    2020  Volume 10, Issue 11

    Abstract Obesity is an emerging public health problem in the Western world as well as in the Gulf region. Qatar, a tiny wealthy country, is among the top-ranked obese countries with a high obesity rate among its population. Compared to the severity of this health crisis in Qatar, only a limited number of studies have focused on the systematic identification of potential risk factors using multimodal datasets. This study aims to develop machine learning (ML) models to distinguish healthy from obese individuals and reveal potential risk factors associated with obesity in Qatar. We designed a case-control study focused on 500 Qatari subjects, comprising 250 obese and 250 healthy individuals, the latter forming the control group. We obtained the most extensive collection of clinical measurements for the Qatari population from the Qatar Biobank (QBB) repertoire, including (i) Physio-clinical Biomarkers, (ii) Spirometry, (iii) VICORDER, (iv) DXA scan composition, and (v) DXA scan densitometry readings. We developed several ML models to distinguish healthy from obese individuals and applied multiple feature selection techniques to identify potential risk factors associated with obesity. The proposed ML model achieved over 90% accuracy, thereby outperforming the existing state-of-the-art models. The outcome from the ablation study on multimodal clinical datasets revealed physio-clinical measurements as the most influential risk factors in distinguishing healthy versus obese subjects. Furthermore, multiple feature ranking techniques confirmed known obesity risk factors (c-peptide, insulin, albumin, uric acid) and identified potential risk factors linked to obesity-related comorbidities such as diabetes (e.g., HbA1c, glucose), liver function (e.g., alkaline phosphatase, gamma-glutamyl transferase), lipid profile (e.g., triglyceride, low density lipoprotein cholesterol, high density lipoprotein cholesterol), etc. Most of the DXA measurements (e.g., bone area, bone mineral composition, bone mineral density, etc.) were significantly (
    Language English
    Publishing date 2020-10-29
    Publishing country Switzerland
    Document type Journal Article
    ZDB-ID 2662336-5
    ISSN 2075-4418
    DOI 10.3390/diagnostics10110883
    Database MEDical Literature Analysis and Retrieval System OnLINE

  4. Book ; Online: A Benchmark Study of Machine Learning Models for Online Fake News Detection

    Khan, Junaed Younus / Khondaker, Md. Tawkat Islam / Afroz, Sadia / Uddin, Gias / Iqbal, Anindya

    2019  

    Abstract The proliferation of fake news and its propagation on social media has become a major concern due to its ability to create devastating impacts. Different machine learning approaches have been suggested to detect fake news. However, most of those focused on a specific type of news (such as political), which leads us to the question of dataset bias in the models used. In this research, we conducted a benchmark study to assess the performance of different applicable machine learning approaches on three different datasets, of which we accumulated the largest and most diversified one. We explored a number of advanced pre-trained language models for fake news detection along with the traditional and deep learning ones and compared their performances from different aspects, for the first time to the best of our knowledge. We find that BERT and similar pre-trained models perform the best for fake news detection, especially with very small datasets. Hence, these models are a significantly better option for languages with limited electronic content, i.e., training data. We also carried out several analyses based on the models' performance, the article's topic, and the article's length, and discussed different lessons learned from them. We believe that this benchmark study will help the research community to explore further and news sites/blogs to select the most appropriate fake news detection method.

    Comment: 22 pages, 5 figures, to be published in Machine Learning with Applications journal
    Keywords Computer Science - Computation and Language ; Computer Science - Information Retrieval ; Computer Science - Machine Learning ; Statistics - Machine Learning
    Subject code 006
    Publishing date 2019-05-12
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

  5. Article: COVID-19Base: A knowledgebase to explore biomedical entities related to COVID-19

    Khan, Junaed Younus / Khondaker, Md. Tawkat Islam / Hoque, Iram Tazim / Al-Absi, Hamada / Rahman, Mohammad Saifur / Alam, Tanvir / Rahman, M. Sohel

    Abstract We are presenting COVID-19Base, a knowledgebase highlighting the biomedical entities related to COVID-19 disease based on literature mining. To develop COVID-19Base, we mine the information from publicly available scientific literature and related public resources. Seven topic-specific dictionaries, including human genes, human miRNAs, human lncRNAs, diseases, Protein Databank, drugs, and drug side effects, are integrated to mine all scientific evidence related to COVID-19. We have employed an automated literature mining and labeling system through a novel approach to measure the effectiveness of drugs against diseases based on natural language processing, sentiment analysis, and deep learning. To the best of our knowledge, this is the first knowledgebase dedicated to COVID-19 that integrates such a large variety of related biomedical entities through literature mining. Proper investigation of the mined biomedical entities, along with the identified interactions among those reported in COVID-19Base, would help the research community to discover possible ways for the therapeutic treatment of COVID-19.
    Keywords covid19
    Publisher ArXiv
    Document type Article
    Database COVID19

  6. Book ; Online: COVID-19Base

    Khan, Junaed Younus / Khondaker, Md. Tawkat Islam / Hoque, Iram Tazim / Al-Absi, Hamada / Rahman, Mohammad Saifur / Alam, Tanvir / Rahman, M. Sohel

    A knowledgebase to explore biomedical entities related to COVID-19

    2020  

    Abstract We are presenting COVID-19Base, a knowledgebase highlighting the biomedical entities related to COVID-19 disease based on literature mining. To develop COVID-19Base, we mine the information from publicly available scientific literature and related public resources. Seven topic-specific dictionaries, including human genes, human miRNAs, human lncRNAs, diseases, Protein Databank, drugs, and drug side effects, are integrated to mine all scientific evidence related to COVID-19. We have employed an automated literature mining and labeling system through a novel approach to measure the effectiveness of drugs against diseases based on natural language processing, sentiment analysis, and deep learning. To the best of our knowledge, this is the first knowledgebase dedicated to COVID-19 that integrates such a large variety of related biomedical entities through literature mining. Proper investigation of the mined biomedical entities, along with the identified interactions among those reported in COVID-19Base, would help the research community to discover possible ways for the therapeutic treatment of COVID-19.

    Comment: 10 pages, 3 figures
    Keywords Computer Science - Information Retrieval ; Computer Science - Computation and Language ; Computer Science - Digital Libraries ; Computer Science - Machine Learning ; Quantitative Biology - Quantitative Methods ; covid19
    Subject code 006
    Publishing date 2020-05-12
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

  7. Article: Toward Preparing a Knowledge Base to Explore Potential Drugs and Biomedical Entities Related to COVID-19: Automated Computational Approach.

    Khan, Junaed Younus / Khondaker, Md Tawkat Islam / Hoque, Iram Tazim / Al-Absi, Hamada R H / Rahman, Mohammad Saifur / Guler, Reto / Alam, Tanvir / Rahman, M Sohel

    JMIR medical informatics

    2020  Volume 8, Issue 11, Page(s) e21648

    Abstract Background: Novel coronavirus disease 2019 (COVID-19) is taking a huge toll on public health. Along with non-therapeutic preventive measures, scientific efforts are currently focused mainly on the development of vaccines and pharmacological treatment with existing drugs. Summarizing evidence from the scientific literature on the discovery of a treatment plan for COVID-19 under one platform would help the scientific community to explore the opportunities in a systematic fashion.
    Objective: The aim of this study is to explore the potential drugs and biomedical entities related to coronavirus-related diseases, including COVID-19, that are mentioned in the scientific literature through an automated computational approach.
    Methods: We mined the information from publicly available scientific literature and related public resources. Six topic-specific dictionaries, including human genes, human miRNAs, diseases, Protein Databank, drugs, and drug side effects, were integrated to mine all scientific evidence related to COVID-19. We employed an automated literature mining and labeling system through a novel approach to measure the effectiveness of drugs against diseases based on natural language processing, sentiment analysis, and deep learning. We also applied the concept of cosine similarity to confidently infer the associations between diseases and genes.
    Results: Based on the literature mining, we identified 1805 diseases, 2454 drugs, and 1910 genes that are related to coronavirus-related diseases, including COVID-19. Integrating the extracted information, we developed the first knowledgebase platform dedicated to COVID-19, which highlights a potential list of drugs and related biomedical entities. For COVID-19, we highlighted multiple case studies on existing drugs along with a confidence score for their applicability in the treatment plan. Based on our computational method, we found that Remdesivir, Statins, Dexamethasone, and Ivermectin could be considered as potentially effective drugs to improve clinical status and lower mortality in patients hospitalized with COVID-19. We also found that Hydroxychloroquine could not be considered as an effective drug for COVID-19. The resulting knowledgebase is made available as an open source tool, named COVID-19Base.
    Conclusions: Proper investigation of the mined biomedical entities along with the identified interactions among those would help the research community to discover possible ways for the therapeutic treatment of COVID-19.
    Keywords covid19
    Language English
    Publishing date 2020-11-10
    Publishing country Canada
    Document type Journal Article
    ZDB-ID 2798261-0
    ISSN 2291-9694
    DOI 10.2196/21648
    Database MEDical Literature Analysis and Retrieval System OnLINE

  8. Article ; Online: Toward Preparing a Knowledge Base to Explore Potential Drugs and Biomedical Entities Related to COVID-19

    Khan, Junaed Younus / Khondaker, Md Tawkat Islam / Hoque, Iram Tazim / Al-Absi, Hamada R H / Rahman, Mohammad Saifur / Guler, Reto / Alam, Tanvir / Rahman, M Sohel

    JMIR Medical Informatics, Vol 8, Iss 11, p e21648

    Automated Computational Approach

    2020  

    Abstract Background: Novel coronavirus disease 2019 (COVID-19) is taking a huge toll on public health. Along with non-therapeutic preventive measures, scientific efforts are currently focused mainly on the development of vaccines and pharmacological treatment with existing drugs. Summarizing evidence from the scientific literature on the discovery of a treatment plan for COVID-19 under one platform would help the scientific community to explore the opportunities in a systematic fashion. Objective: The aim of this study is to explore the potential drugs and biomedical entities related to coronavirus-related diseases, including COVID-19, that are mentioned in the scientific literature through an automated computational approach. Methods: We mined the information from publicly available scientific literature and related public resources. Six topic-specific dictionaries, including human genes, human miRNAs, diseases, Protein Databank, drugs, and drug side effects, were integrated to mine all scientific evidence related to COVID-19. We employed an automated literature mining and labeling system through a novel approach to measure the effectiveness of drugs against diseases based on natural language processing, sentiment analysis, and deep learning. We also applied the concept of cosine similarity to confidently infer the associations between diseases and genes. Results: Based on the literature mining, we identified 1805 diseases, 2454 drugs, and 1910 genes that are related to coronavirus-related diseases, including COVID-19. Integrating the extracted information, we developed the first knowledgebase platform dedicated to COVID-19, which highlights a potential list of drugs and related biomedical entities. For COVID-19, we highlighted multiple case studies on existing drugs along with a confidence score for their applicability in the treatment plan. Based on our computational method, we found that Remdesivir, Statins, Dexamethasone, and Ivermectin could be considered as potentially effective drugs to improve clinical status and lower mortality in ...
    Keywords Computer applications to medicine. Medical informatics ; R858-859.7
    Subject code 610
    Language English
    Publishing date 2020-11-01
    Publisher JMIR Publications
    Document type Article ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)
