LIVIVO - The Search Portal for Life Sciences

Search results

Result 1 - 10 of total 11

  1. Article ; Online: Design and analysis of a large-scale COVID-19 tweets dataset.

    Lamsal, Rabindra

    Applied intelligence (Dordrecht, Netherlands)

    2020  Volume 51, Issue 5, Page(s) 2790–2804

    Abstract As of July 17, 2020, more than thirteen million people have been diagnosed with the Novel Coronavirus (COVID-19), and half a million people have already lost their lives due to this infectious disease. The World Health Organization declared the COVID-19 outbreak as a pandemic on March 11, 2020. Since then, social media platforms have experienced an exponential rise in the content related to the pandemic. In the past, Twitter data have been observed to be indispensable in the extraction of situational awareness information relating to any crisis. This paper presents COV19Tweets Dataset (Lamsal 2020a), a large-scale Twitter dataset with more than 310 million COVID-19 specific English language tweets and their sentiment scores. The dataset's geo version, the GeoCOV19Tweets Dataset (Lamsal 2020b), is also presented. The paper discusses the datasets' design in detail, and the tweets in both the datasets are analyzed. The datasets are released publicly, anticipating that they would contribute to a better understanding of spatial and temporal dimensions of the public discourse related to the ongoing pandemic. As per the stats, the datasets (Lamsal 2020a, 2020b) have been accessed over 74.5k times, collectively.
    Language English
    Publishing date 2020-11-06
    Publishing country Netherlands
    Document type Journal Article
    ZDB-ID 1479519-X
    ISSN (online) 1573-7497
    ISSN 0924-669X
    DOI 10.1007/s10489-020-02029-z
    Database MEDical Literature Analysis and Retrieval System OnLINE

  2. Article ; Online: BillionCOV: An enriched billion-scale collection of COVID-19 tweets for efficient hydration.

    Lamsal, Rabindra / Read, Maria Rodriguez / Karunasekera, Shanika

    Data in brief

    2023  Volume 48, Page(s) 109229

    Abstract The COVID-19 pandemic has introduced new norms, such as social distancing, face masks, quarantine, lockdowns, travel restrictions, work/study from home, and business closures, to name a few. The pandemic's seriousness has made people vocal on social media, especially on microblogs such as Twitter. Since the early days of the outbreak, researchers have been collecting and sharing large-scale datasets of COVID-19 tweets. However, the existing datasets carry issues related to
    Language English
    Publishing date 2023-05-12
    Publishing country Netherlands
    Document type Journal Article
    ZDB-ID 2786545-9
    ISSN (online) 2352-3409
    DOI 10.1016/j.dib.2023.109229
    Database MEDical Literature Analysis and Retrieval System OnLINE

  3. Article ; Online: Design and analysis of a large-scale COVID-19 tweets dataset

    Lamsal, Rabindra

    Appl Intell

    Abstract As of July 17, 2020, more than thirteen million people have been diagnosed with the Novel Coronavirus (COVID-19), and half a million people have already lost their lives due to this infectious disease. The World Health Organization declared the COVID-19 outbreak as a pandemic on March 11, 2020. Since then, social media platforms have experienced an exponential rise in the content related to the pandemic. In the past, Twitter data have been observed to be indispensable in the extraction of situational awareness information relating to any crisis. This paper presents COV19Tweets Dataset (Lamsal 2020a), a large-scale Twitter dataset with more than 310 million COVID-19 specific English language tweets and their sentiment scores. The dataset’s geo version, the GeoCOV19Tweets Dataset (Lamsal 2020b), is also presented. The paper discusses the datasets’ design in detail, and the tweets in both the datasets are analyzed. The datasets are released publicly, anticipating that they would contribute to a better understanding of spatial and temporal dimensions of the public discourse related to the ongoing pandemic. As per the stats, the datasets (Lamsal 2020a, 2020b) have been accessed over 74.5k times, collectively.
    Keywords covid19
    Publisher PMC
    Document type Article ; Online
    DOI 10.1007/s10489-020-02029-z
    Database COVID19

  4. Article: Twitter conversations predict the daily confirmed COVID-19 cases.

    Lamsal, Rabindra / Harwood, Aaron / Read, Maria Rodriguez

    Applied soft computing

    2022  Volume 129, Page(s) 109603

    Abstract As of writing this paper, COVID-19 (Coronavirus disease 2019) has spread to more than 220 countries and territories. Following the outbreak, the pandemic's seriousness has made people more active on social media, especially on the microblogging platforms such as Twitter and Weibo. The pandemic-specific discourse has remained on-trend on these platforms for months now. Previous studies have confirmed the contributions of such socially generated conversations towards situational awareness of crisis events. The early forecasts of cases are essential to authorities to estimate the requirements of resources needed to cope with the outgrowths of the virus. Therefore, this study attempts to incorporate the public discourse in the design of forecasting models particularly targeted for the steep-hill region of an ongoing wave. We propose a sentiment-involved topic-based latent variables search methodology for designing forecasting models from publicly available Twitter conversations. As a use case, we implement the proposed methodology on Australian COVID-19 daily cases and Twitter conversations generated within the country. Experimental results: (i) show the presence of latent social media variables that Granger-cause the daily COVID-19 confirmed cases, and (ii) confirm that those variables offer additional prediction capability to forecasting models. Further, the results show that the inclusion of social media variables introduces 48.83%-51.38% improvements on RMSE over the baseline models. We also release the large-scale COVID-19 specific geotagged global tweets dataset,
    Language English
    Publishing date 2022-09-05
    Publishing country United States
    Document type Journal Article
    ISSN 1568-4946
    DOI 10.1016/j.asoc.2022.109603
    Database MEDical Literature Analysis and Retrieval System OnLINE

  5. Book ; Online: A Twitter narrative of the COVID-19 pandemic in Australia

    Lamsal, Rabindra / Read, Maria Rodriguez / Karunasekera, Shanika

    2023  

    Abstract Social media platforms contain abundant data that can provide comprehensive knowledge of historical and real-time events. During crisis events, the use of social media peaks, as people discuss what they have seen, heard, or felt. Previous studies confirm the usefulness of such socially generated discussions for the public, first responders, and decision-makers to gain a better understanding of events as they unfold at the ground level. This study performs an extensive analysis of COVID-19-related Twitter discussions generated in Australia between January 2020 and October 2022. We explore the Australian Twitterverse by employing state-of-the-art approaches from both supervised and unsupervised domains to perform network analysis, topic modeling, sentiment analysis, and causality analysis. As the presented results provide a comprehensive understanding of the Australian Twitterverse during the COVID-19 pandemic, this study aims to explore the discussion dynamics to aid the development of future automated information systems for epidemic/pandemic management.

    Comment: Accepted to ISCRAM 2023
    Keywords Computer Science - Social and Information Networks
    Publishing date 2023-02-21
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

  6. Book ; Online: BillionCOV

    Lamsal, Rabindra / Read, Maria Rodriguez / Karunasekera, Shanika

    An Enriched Billion-scale Collection of COVID-19 tweets for Efficient Hydration

    2023  

    Abstract The COVID-19 pandemic introduced new norms such as social distancing, face masks, quarantine, lockdowns, travel restrictions, work/study from home, and business closures, to name a few. The pandemic's seriousness made people vocal on social media, especially on microblogs such as Twitter. Researchers have been collecting and sharing large-scale datasets of COVID-19 tweets since the early days of the outbreak. Sharing raw Twitter data with third parties is restricted; users need to hydrate tweet identifiers in a public dataset to re-create the dataset locally. Large-scale datasets that include original tweets, retweets, quotes, and replies contain billions of tweets, which take months to hydrate. The existing datasets carry issues related to proportion and redundancy. We report that more than 500 million tweet identifiers point to deleted or protected tweets. In order to address these issues, this paper introduces an enriched global billion-scale English-language COVID-19 tweets dataset, BillionCOV, that contains 1.4 billion tweets originating from 240 countries and territories between October 2019 and April 2022. Importantly, BillionCOV lets researchers filter tweet identifiers for efficient hydration. This paper discusses associated methods to fetch raw Twitter data for a set of tweet identifiers, presents multiple tweets' distributions to provide an overview of BillionCOV, and finally, reviews the dataset's potential use cases.
    Keywords Computer Science - Social and Information Networks
    Subject code 410
    Publishing date 2023-01-18
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

  7. Book ; Online: CrisisTransformers

    Lamsal, Rabindra / Read, Maria Rodriguez / Karunasekera, Shanika

    Pre-trained language models and sentence encoders for crisis-related social media texts

    2023  

    Abstract Social media platforms play an essential role in crisis communication, but analyzing crisis-related social media texts is challenging due to their informal nature. Transformer-based pre-trained models like BERT and RoBERTa have shown success in various NLP tasks, but they are not tailored for crisis-related texts. Furthermore, general-purpose sentence encoders are used to generate sentence embeddings, regardless of the textual complexities in crisis-related texts. Advances in applications like text classification, semantic search, and clustering contribute to effective processing of crisis-related texts, which is essential for emergency responders to gain a comprehensive view of a crisis event, whether historical or real-time. To address these gaps in crisis informatics literature, this study introduces CrisisTransformers, an ensemble of pre-trained language models and sentence encoders trained on an extensive corpus of over 15 billion word tokens from tweets associated with more than 30 crisis events, including disease outbreaks, natural disasters, conflicts, and other critical incidents. We evaluate existing models and CrisisTransformers on 18 crisis-specific public datasets. Our pre-trained models outperform strong baselines across all datasets in classification tasks, and our best-performing sentence encoder improves the state-of-the-art by 17.43% in sentence encoding tasks. Additionally, we investigate the impact of model initialization on convergence and evaluate the significance of domain-specific models in generating semantically meaningful sentence embeddings. All models are publicly released (https://huggingface.co/crisistransformers), with the anticipation that they will serve as a robust baseline for tasks involving the analysis of crisis-related social media texts.
    Keywords Computer Science - Computation and Language
    Subject code 070
    Publishing date 2023-09-11
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

  8. Book ; Online: GeoCovaxTweets

    Singh, Pardeep / Lamsal, Rabindra / Monika / Chand, Satish / Shishodia, Bhawna

    COVID-19 Vaccines and Vaccination-specific Global Geotagged Twitter Conversations

    2023  

    Abstract Social media platforms provide actionable information during crises and pandemic outbreaks. The COVID-19 pandemic has imposed a chronic public health crisis worldwide, with experts considering vaccines as the ultimate prevention to achieve herd immunity against the virus. A proportion of people may turn to social media platforms to oppose vaccines and vaccination, hindering government efforts to eradicate the virus. This paper presents the COVID-19 vaccines and vaccination-specific global geotagged tweets dataset, GeoCovaxTweets, that contains more than 1.8 million tweets, with location information and longer temporal coverage, originating from 233 countries and territories between January 2020 and November 2022. The paper discusses the dataset's curation method and how it can be re-created locally, and later explores the dataset through multiple tweets distributions and briefly discusses its potential use cases. We anticipate that the dataset will assist the researchers in the crisis computing domain to explore the conversational dynamics of COVID-19 vaccines and vaccination Twitter discourse through numerous spatial and temporal dimensions concerning trends, shifts in opinions, misinformation, and anti-vaccination campaigns.
    Keywords Computer Science - Social and Information Networks
    Subject code 306
    Publishing date 2023-01-18
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

  9. Book ; Online: Where did you tweet from? Inferring the origin locations of tweets based on contextual information

    Lamsal, Rabindra / Harwood, Aaron / Read, Maria Rodriguez

    2022  

    Abstract Public conversations on Twitter comprise many pertinent topics including disasters, protests, politics, propaganda, sports, climate change, epidemics/pandemic outbreaks, etc., that can have both regional and global aspects. Spatial discourse analysis relies on geographical data. However, today less than 1% of tweets are geotagged, whether with a point location or bounding place information. A major issue with tweets is that Twitter users can be at location A and exchange conversations specific to location B, which we call the Location A/B problem. The problem is considered solved if location entities can be classified as either origin locations (Location As) or non-origin locations (Location Bs). In this work, we propose a simple yet effective framework, the True Origin Model, that addresses the problem by using machine-level natural language understanding to identify tweets that conceivably contain their origin location information. The model achieves promising accuracy at country (80%), state (67%), city (58%), county (56%) and district (64%) levels with support from a Location Extraction Model as basic as the CoNLL-2003-based RoBERTa. We employ a tweet contextualizer (locBERT), which is one of the core components of the proposed model, to investigate multiple tweets' distributions for understanding Twitter users' tweeting behavior in terms of mentioning origin and non-origin locations. We also highlight a major concern with the currently regarded gold standard test set (ground truth) methodology, introduce a new data set, and identify further research avenues for advancing the area.

    Comment: To appear in Proceedings of the IEEE Big Data Conference 2022
    Keywords Computer Science - Computation and Language ; Computer Science - Social and Information Networks
    Subject code 004
    Publishing date 2022-11-17
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

  10. Book ; Online: Twitter conversations predict the daily confirmed COVID-19 cases

    Lamsal, Rabindra / Harwood, Aaron / Read, Maria Rodriguez

    2022  

    Abstract As of writing this paper, COVID-19 (Coronavirus disease 2019) has spread to more than 220 countries and territories. Following the outbreak, the pandemic's seriousness has made people more active on social media, especially on the microblogging platforms such as Twitter and Weibo. The pandemic-specific discourse has remained on-trend on these platforms for months now. Previous studies have confirmed the contributions of such socially generated conversations towards situational awareness of crisis events. The early forecasts of cases are essential to authorities to estimate the requirements of resources needed to cope with the outgrowths of the virus. Therefore, this study attempts to incorporate the public discourse in the design of forecasting models particularly targeted for the steep-hill region of an ongoing wave. We propose a sentiment-involved topic-based methodology for designing multiple time series from publicly available COVID-19 related Twitter conversations. As a use case, we implement the proposed methodology on Australian COVID-19 daily cases and Twitter conversations generated within the country. Experimental results: (i) show the presence of latent social media variables that Granger-cause the daily COVID-19 confirmed cases, and (ii) confirm that those variables offer additional prediction capability to forecasting models. Further, the results show that the inclusion of social media variables for modeling introduces 48.83%-51.38% improvements on RMSE over the baseline models. We also release the large-scale COVID-19 specific geotagged global tweets dataset, MegaGeoCOV, to the public anticipating that the geotagged data of this scale would aid in understanding the conversational dynamics of the pandemic through other spatial and temporal contexts.

    Comment: Preprint under review at an Elsevier Journal
    Keywords Computer Science - Computation and Language ; Computer Science - Social and Information Networks
    Subject code 306
    Publishing date 2022-06-21
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)
