Public Opinions on Vaccine

A Machine Learning-based Analysis of U.S. Tweets

Where Informatics Borders Health Services Research

AMIA 2022 | Paper
🏆 Second Place, Student Paper Competition (Finalist)
Role: Undergraduate Research Assistant
Topic: Public Opinions toward COVID-19 Vaccine Mandates: A Machine Learning-based Analysis of U.S. Tweets.
Methods: Sentiment analysis, Qualitative content analysis.
Duration: Dec 2021 - Mar 2022.

Description: This study applied machine learning-based analysis to tweets from the US posted before and after the Biden Administration’s announcement of federal vaccine mandates. In addition, a qualitative content analysis of a random sample of relevant tweets explored the beliefs Twitter users held about vaccine mandates and the evidence they used to support their positions. Results revealed that only 30% of users supported vaccine mandates, while the majority expressed opposing views. Concerns included political motives, infringement of personal liberty, and the vaccine's ineffectiveness in preventing infection.

Empirical Contribution:

  • Significant margin for improvement in public health communication.
    • 70% of the circulated tweets expressed negative opinions on vaccination mandates.
  • Lessons learned to inform better communication practices.
    • Public reaction was constantly evolving.
    • Some of the negative opinions were misguided or based on outdated information.
    • Some were provoked by unrealistic expectations or over-promised benefits.
    • Some frequently cited objections, such as religious beliefs, were rarely mentioned in the data.

Methodology:

  • Data collection and preprocessing.
    • Retrieve tweets relevant to COVID-19 vaccine mandates.
  • Two-stage machine-learning classification.
    • Remove irrelevant tweets, then run sentiment analysis on the rest.
  • Longitudinal analyses and topic modeling.
    • Analyze the evolution of public opinions.
  • Qualitative content analysis.
    • Discover evidence used to support user positions.
Data collection
Two-stage machine-learning classification
Longitudinal analyses and topic modeling
Qualitative content analysis
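
The two-stage classification step can be sketched roughly as follows. This is only an illustration of the control flow (filter for relevance first, then assign a sentiment label to what remains). The functions `is_relevant` and `sentiment`, the cue-word lists, the three label names, and the sample tweets are all hypothetical stand-ins; the study's trained models and features are not described on this page.

```python
from collections import Counter
from typing import Callable, Iterable, List, Tuple

# Hypothetical cue words, used only so the sketch runs without a trained model.
NEGATIVE_CUES = ("forcing", "illegal", "unconstitutional", "freedom")
POSITIVE_CUES = ("support", "safe", "protect")


def is_relevant(text: str) -> bool:
    """Stage 1 stand-in: keep tweets that mention a vaccine and a mandate/requirement."""
    t = text.lower()
    return "vaccin" in t and ("mandate" in t or "require" in t)


def sentiment(text: str) -> str:
    """Stage 2 stand-in: compare counts of positive and negative cue words."""
    t = text.lower()
    neg = sum(cue in t for cue in NEGATIVE_CUES)
    pos = sum(cue in t for cue in POSITIVE_CUES)
    if neg > pos:
        return "negative"
    if pos > neg:
        return "positive"
    return "neutral"


def two_stage(
    tweets: Iterable[str],
    relevance: Callable[[str], bool] = is_relevant,
    classify: Callable[[str], str] = sentiment,
) -> List[Tuple[str, str]]:
    """Drop irrelevant tweets, then label each remaining tweet."""
    kept = [t for t in tweets if relevance(t)]
    return [(t, classify(t)) for t in kept]


# Toy input, invented for this example.
sample = [
    "I support the vaccine mandate, it keeps schools safe",
    "Forcing a vaccine mandate on workers is unconstitutional",
    "Great game by the NFL players tonight",
    "New vaccination requirements announced for federal staff",
]
labeled = two_stage(sample)
label_counts = Counter(label for _, label in labeled)
```

Because both stages are passed in as plain callables, the keyword stand-ins could be swapped for real trained classifiers without changing `two_stage` itself.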

Results:

Tweets containing personal opinions - part 1
Tweets containing personal opinions - part 2

Topic modeling of the incubating phase (07/14-09/08/2021)

  • Current and potential mandate policies.
    • Representative words: delta, government, right, forcing, choice, body.
  • Vaccination requirements for business reopening and public events.
    • Representative words: business, company, workers, players, NFL.
  • Vaccination requirements for school reopening.
    • Representative words: students, teachers, schools, safe, keep, wear, masks.

Topic modeling of the promulgation phase (09/09-11/04/2021)

  • Liberty restriction and legality.
    • Representative words: free, freedom, illegal, immigrants, border, Biden, force, state.
  • Populations affected.
    • Representative words: workers, staff, businesses, companies, citizens, military, federal.
  • Effectiveness and safety.
    • Representative words: diseases, experimental, herd, side, effects, immunity.

Topic modeling of the aftermath phase (11/05-12/31/2021)

  • Legality.
    • Representative words: military, federal, choice, employee, Biden, unconstitutional, right, government, illegal.
  • Anti-vaccine / anti-mandates protests.
    • Representative words: anti, stop, right, transmission, force, health, natural, immunity.
  • Effects on travel.
    • Representative words: travel, flights, infection, spread, domestic, Trump.
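
For the longitudinal view, the three phase windows listed above can be turned into a simple date lookup, after which labeled tweets are grouped by phase. The sketch below assumes each tweet has already been reduced to a `(date, label)` pair; the helper names `phase_of` and `negative_share_by_phase` and the toy records are hypothetical and do not come from the study's code or data.

```python
from collections import Counter, defaultdict
from datetime import date
from typing import Dict, Iterable, Optional, Tuple

# Phase windows as given on this page (inclusive on both ends).
PHASES = [
    ("incubating", date(2021, 7, 14), date(2021, 9, 8)),
    ("promulgation", date(2021, 9, 9), date(2021, 11, 4)),
    ("aftermath", date(2021, 11, 5), date(2021, 12, 31)),
]


def phase_of(day: date) -> Optional[str]:
    """Return the phase containing `day`, or None if it falls outside all windows."""
    for name, start, end in PHASES:
        if start <= day <= end:
            return name
    return None


def negative_share_by_phase(
    labeled: Iterable[Tuple[date, str]],
) -> Dict[str, float]:
    """Fraction of tweets labeled 'negative' within each phase."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    for day, label in labeled:
        phase = phase_of(day)
        if phase is not None:  # skip tweets outside the study windows
            counts[phase][label] += 1
    return {p: c["negative"] / sum(c.values()) for p, c in counts.items()}


# Toy records, invented for this example.
records = [
    (date(2021, 8, 1), "negative"),
    (date(2021, 8, 2), "positive"),
    (date(2021, 9, 10), "negative"),
    (date(2021, 12, 1), "negative"),
    (date(2021, 12, 2), "negative"),
    (date(2022, 1, 5), "positive"),  # outside all windows, ignored
]
shares = negative_share_by_phase(records)
```

The same grouping could feed a per-phase topic model, with each phase's tweets fitted separately.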
Qualitative content analysis — Positive themes
Qualitative content analysis — Negative themes

Acknowledgment: The material (pictures) was created by Yawen Guo (lead author) for her presentation at the AMIA 2022 conference. Much appreciated!