LIVIVO - The Search Portal for Life Sciences


Search results

Result 1 - 10 of total 15

  1. Book ; Online: FakeClaim

    Shahi, Gautam Kishore / Jaiswal, Amit Kumar / Mandl, Thomas

    A Multiple Platform-driven Dataset for Identification of Fake News on 2023 Israel-Hamas War

    2024  

    Abstract We contribute the first publicly available dataset of factual claims from different platforms and fake YouTube videos on the 2023 Israel-Hamas war for automatic fake YouTube video classification. The FakeClaim data is collected from 60 fact-checking organizations in 30 languages and enriched with metadata from the fact-checking organizations curated by trained journalists specialized in fact-checking. Further, we classify fake videos within the subset of YouTube videos using textual information and user comments. We used a pre-trained model to classify each video with different feature combinations. Our best-performing fine-tuned language model, Universal Sentence Encoder (USE), achieves a Macro F1 of 87%, which shows that the trained model can be helpful for debunking fake videos using the comments from the user discussion. The dataset is available on GitHub (https://github.com/Gautamshahi/FakeClaim).

    Comment: Accepted in the IR4Good Track at the 46th European Conference on Information Retrieval (ECIR) 2024
    Keywords Computer Science - Information Retrieval ; Computer Science - Social and Information Networks
    Subject code 004
    Publishing date 2024-01-29
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

  2. Article: AMUSED: An Annotation Framework of Multi-modal Social Media Data

    Shahi, Gautam Kishore

    Abstract In this paper, we present a semi-automated framework called AMUSED for gathering multi-modal annotated data from multiple social media platforms. The framework is designed to mitigate the issues of collecting and annotating social media data by cohesively combining machine and human effort in the data collection process. From a given list of articles from professional news media or blogs, AMUSED detects links to social media posts in the news articles and then downloads the contents of each post from the respective social media platform to gather details about that specific post. The framework is capable of fetching annotated data from multiple platforms such as Twitter, YouTube, and Reddit. The framework aims to reduce the workload and problems behind data annotation on social media platforms. AMUSED can be applied in multiple application domains; as a use case, we have implemented the framework for collecting COVID-19 misinformation data from different social media platforms.
    Keywords covid19
    Publisher ArXiv
    Document type Article
    Database COVID19

  3. Book ; Online: AMUSED

    Shahi, Gautam Kishore

    An Annotation Framework of Multi-modal Social Media Data

    2020  

    Abstract In this paper, we present a semi-automated framework called AMUSED for gathering multi-modal annotated data from multiple social media platforms. The framework is designed to mitigate the issues of collecting and annotating social media data by cohesively combining machine and human effort in the data collection process. From a given list of articles from professional news media or blogs, AMUSED detects links to social media posts in the news articles and then downloads the contents of each post from the respective social media platform to gather details about that specific post. The framework is capable of fetching annotated data from multiple platforms such as Twitter, YouTube, and Reddit. The framework aims to reduce the workload and problems behind data annotation on social media platforms. AMUSED can be applied in multiple application domains; as a use case, we have implemented the framework for collecting COVID-19 misinformation data from different social media platforms.

    Comment: 10 pages, 5 figures, 3 tables
    Keywords Computer Science - Social and Information Networks ; Computer Science - Computation and Language ; Computer Science - Information Retrieval ; covid19
    Subject code 004
    Publishing date 2020-10-01
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

  4. Article ; Online: An exploratory study of COVID-19 misinformation on Twitter.

    Shahi, Gautam Kishore / Dirkson, Anne / Majchrzak, Tim A

    Online social networks and media

    2021  Volume 22, Page(s) 100104

    Abstract During the COVID-19 pandemic, social media has become a home ground for misinformation. To tackle this infodemic, scientific oversight, as well as a better understanding by practitioners in crisis management, is needed. We have conducted an exploratory study into the propagation, authors and content of misinformation on Twitter around the topic of COVID-19 in order to gain early insights. We have collected all tweets mentioned in the verdicts of fact-checked claims related to COVID-19 by over 92 professional fact-checking organisations between January and mid-July 2020 and share this corpus with the community. This resulted in 1500 tweets relating to 1274 false and 226 partially false claims, respectively. Exploratory analysis of author accounts revealed that verified Twitter handles (including organisations and celebrities) are also involved in either creating (new tweets) or spreading (retweets) the misinformation. Additionally, we found that false claims propagate faster than partially false claims. Compared to a background corpus of COVID-19 tweets, tweets with misinformation are more often concerned with discrediting other information on social media. Authors use less tentative language and appear to be more driven by concerns of potential harm to others. Our results enable us to suggest gaps in the current scientific coverage of the topic as well as propose actions for authorities and social media users to counter misinformation.
    Language English
    Publishing date 2021-02-19
    Publishing country United States
    Document type Journal Article
    ISSN 2468-6964
    ISSN (online) 2468-6964
    DOI 10.1016/j.osnem.2020.100104
    Database MEDical Literature Analysis and Retrieval System OnLINE

  5. Book ; Online: FakeCovid- A Multilingual Cross domain Fact Check Dataset for COVID-19

    Shahi, Gautam Kishore / Nandini, Durgesh

    2020  

    Abstract FakeCovid is the first multilingual cross-domain dataset of 7623 fact-checked news articles for COVID-19, collected from 04/01/2020 to 01/07/2020. We have collected the fact-checked articles from 92 fact-checking websites after obtaining references from Poynter and Snopes. We have manually annotated the collected articles into 11 categories of the fact-checked news according to their content. The resulting dataset is in 40 languages from 105 countries.
    Keywords Fake News ; Coronavirus ; COVID19 ; Fact check ; misinformation ; multilingual ; covid19
    Language English
    Publishing date 2020-07-29
    Publishing country eu
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

  6. Article ; Online: FakeCovid -- A Multilingual Cross-domain Fact Check News Dataset for COVID-19

    Shahi, Gautam Kishore / Nandini, Durgesh

    Abstract In this paper, we present a first multilingual cross-domain dataset of 5182 fact-checked news articles for COVID-19, collected from 04/01/2020 to 15/05/2020. We have collected the fact-checked articles from 92 different fact-checking websites after obtaining references from Poynter and Snopes. We have manually annotated articles into 11 different categories of the fact-checked news according to their content. The dataset is in 40 languages from 105 countries. We have built a classifier to detect fake news and present results for the automatic fake news detection and its class. Our model achieves an F1 score of 0.76 to detect the false class and other fact check articles. The FakeCovid dataset is available at Github.
    Keywords covid19
    Publisher ArXiv
    Document type Article ; Online
    DOI 10.36190/2020.14
    Database COVID19

  7. Book ; Online: FakeCovid -- A Multilingual Cross-domain Fact Check News Dataset for COVID-19

    Shahi, Gautam Kishore / Nandini, Durgesh

    2020  

    Abstract In this paper, we present a first multilingual cross-domain dataset of 5182 fact-checked news articles for COVID-19, collected from 04/01/2020 to 15/05/2020. We have collected the fact-checked articles from 92 different fact-checking websites after obtaining references from Poynter and Snopes. We have manually annotated articles into 11 different categories of the fact-checked news according to their content. The dataset is in 40 languages from 105 countries. We have built a classifier to detect fake news and present results for the automatic fake news detection and its class. Our model achieves an F1 score of 0.76 to detect the false class and other fact check articles. The FakeCovid dataset is available at Github.

    Comment: CySoc 2020 International Workshop on Cyber Social Threats, ICWSM 2020
    Keywords Computer Science - Computers and Society ; Computer Science - Social and Information Networks ; covid19
    Publishing date 2020-06-19
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

  8. Book ; Online: Regret, Delete, (Do Not) Repeat

    Ferreyra, Nicolás E. Díaz / Shahi, Gautam Kishore / Tony, Catherine / Stieglitz, Stefan / Scandariato, Riccardo

    An Analysis of Self-Cleaning Practices on Twitter After the Outbreak of the COVID-19 Pandemic

    2023  

    Abstract During the outbreak of the COVID-19 pandemic, many people shared their symptoms across Online Social Networks (OSNs) like Twitter, hoping for others' advice or moral support. Prior studies have shown that those who disclose health-related information across OSNs often tend to regret it and delete their publications afterwards. Hence, deleted posts containing sensitive data can be seen as manifestations of online regrets. In this work, we present an analysis of deleted content on Twitter during the outbreak of the COVID-19 pandemic. For this, we collected more than 3.67 million tweets describing COVID-19 symptoms (e.g., fever, cough, and fatigue) posted between January and April 2020. We observed that around 24% of the tweets containing personal pronouns were deleted either by their authors or by the platform after one year. As a practical application of the resulting dataset, we explored its suitability for the automatic classification of regrettable content on Twitter.

    Comment: Accepted at CHI '23 Late Breaking Work (LBW)
    Keywords Computer Science - Social and Information Networks ; Computer Science - Human-Computer Interaction
    Publishing date 2023-03-16
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

  9. Book ; Online: Who shapes crisis communication on Twitter? An analysis of influential German-language accounts during the COVID-19 pandemic

    Shahi, Gautam Kishore / Clausen, Sünje / Stieglitz, Stefan

    2021  

    Abstract Twitter is becoming an increasingly important platform for disseminating information during crisis situations, such as the COVID-19 pandemic. Effective crisis communication on Twitter can shape the public perception of the crisis, influence adherence to preventative measures, and thus affect public health. Influential accounts are particularly important as they reach large audiences quickly. This study identifies influential German-language accounts from almost 3 million German tweets collected between January and May 2020 by constructing a retweet network and calculating PageRank centrality values. We capture the volatility of crisis communication by structuring the analysis into seven stages based on key events during the pandemic and profile influential accounts into roles. Our analysis shows that news and journalist accounts were influential throughout all phases, while government accounts were particularly important shortly before and after the lockdown was instantiated. We discuss implications for crisis communication during health crises and for analyzing long-term crisis data.

    Comment: 10 pages
    Keywords Computer Science - Social and Information Networks ; Computer Science - Computers and Society
    Subject code 306
    Publishing date 2021-09-12
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

  10. Article ; Online: The Networked Context of COVID-19 Misinformation: Informational Homogeneity on YouTube at the Beginning of the Pandemic.

    Röchert, Daniel / Shahi, Gautam Kishore / Neubaum, German / Ross, Björn / Stieglitz, Stefan

    Online social networks and media

    2021  Volume 26, Page(s) 100164

    Abstract During the coronavirus disease 2019 (COVID-19) pandemic, the video-sharing platform YouTube has been serving as an essential instrument to widely distribute news related to the global public health crisis and to allow users to discuss the news with each other in the comment sections. Along with these enhanced opportunities of technology-based communication, there is an overabundance of information and, in many cases, misinformation about current events. In times of a pandemic, the spread of misinformation can have direct detrimental effects, potentially influencing citizens' behavioral decisions (e.g., to not socially distance) and putting collective health at risk. Misinformation could be especially harmful if it is distributed in isolated news cocoons that homogeneously provide misinformation in the absence of corrections or mere accurate information. The present study analyzes data gathered at the beginning of the pandemic (January-March 2020) and focuses on the network structure of YouTube videos and their comments to understand the level of informational homogeneity associated with misinformation on COVID-19 and its evolution over time. This study combined machine learning and network analytic approaches. Results indicate that nodes (either individual users or channels) that spread misinformation were usually integrated in heterogeneous discussion networks, predominantly involving content other than misinformation. This pattern remained stable over time. Findings are discussed in light of the COVID-19 "infodemic" and the fragmentation of information networks.
    Language English
    Publishing date 2021-08-30
    Publishing country United States
    Document type Journal Article
    ISSN 2468-6964
    ISSN (online) 2468-6964
    DOI 10.1016/j.osnem.2021.100164
    Database MEDical Literature Analysis and Retrieval System OnLINE
