Application of the Naïve Bayes Classifier Algorithm to Analyze Sentiment for the Covid-19 Vaccine on Twitter in Jakarta
DOI:
https://doi.org/10.25124/ijies.v7i01.171
Keywords:
Sentiment Analysis; Text Pre-Processing; Naïve Bayes Classifier; TF-IDF; Twitter
Abstract
The outbreak of a new disease caused by the coronavirus (2019-nCoV), commonly referred to as
COVID-19, has been declared a global pandemic by the World Health Organization (WHO).
President Joko Widodo has officially ratified Presidential Decree No. 99 of 2020 concerning the
provision of vaccines and the implementation of vaccination activities. Twitter is a social media
platform that allows users to share information and opinions directly with fellow users. Tweets
can express any opinion, positive or negative, so sentiment analysis is one method for studying them.
Sentiment analysis determines whether an opinion or comment on an issue is positive or negative.
The Naïve Bayes algorithm is used here because it is well suited to short texts such as tweets.
The initial stage of sentiment analysis is text
pre-processing, which consists of cleaning, case folding, tokenizing, and stopword removal. Then the
data is labeled manually. The analysis results are visualized as bar charts, pie charts, and word clouds.
Then word weighting is carried out using term frequency-inverse document frequency (TF-IDF), and
classification is carried out using the Naïve Bayes classifier. In testing, the accuracy computed
from the confusion matrix is 82% on 2600 tweets, split into 80% training data and 20% test data.
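
The pipeline described in the abstract (pre-processing, TF-IDF weighting, Naïve Bayes classification, and accuracy from a confusion matrix on an 80/20 split) can be sketched in a few lines of standard-library Python. This is a minimal illustration under stated assumptions, not the authors' implementation. The ten tweets are invented for the example and are not drawn from the study's 2600-tweet dataset. The stopword list is a tiny placeholder, and the function names (preprocess, fit_idf, tfidf, train_nb, predict) are hypothetical. The abstract does not say which TF-IDF variant or smoothing was used, so the sketch assumes a smoothed IDF, ln((1 + N) / (1 + df)) + 1, and a multinomial Naïve Bayes with Laplace smoothing (alpha = 1).

```python
import math
import re
from collections import Counter, defaultdict

# Placeholder stopword list (assumption); a real run would use a full list.
STOPWORDS = {"the", "is", "a", "i", "and", "to", "this"}

def preprocess(tweet):
    """Cleaning, case folding, tokenizing, and stopword removal."""
    text = re.sub(r"https?://\S+|[@#]\w+", " ", tweet)  # cleaning: URLs, mentions, hashtags
    text = text.lower()                                  # case folding
    tokens = re.findall(r"[a-z]+", text)                 # tokenizing; drops digits and punctuation
    return [t for t in tokens if t not in STOPWORDS]     # stopword removal

def fit_idf(docs):
    """Smoothed IDF learned from the training documents only."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return {term: math.log((1 + n) / (1 + count)) + 1 for term, count in df.items()}

def tfidf(doc, idf):
    """TF-IDF weights for one document; terms unseen in training are ignored."""
    tf = Counter(t for t in doc if t in idf)
    return {term: count * idf[term] for term, count in tf.items()}

def train_nb(vectors, labels, vocab, alpha=1.0):
    """Multinomial Naive Bayes over TF-IDF weights with Laplace smoothing."""
    sums = defaultdict(lambda: defaultdict(float))
    for vec, label in zip(vectors, labels):
        for term, weight in vec.items():
            sums[label][term] += weight
    model = {}
    for label, n_docs in Counter(labels).items():
        total = sum(sums[label].values()) + alpha * len(vocab)
        log_like = {t: math.log((sums[label].get(t, 0.0) + alpha) / total) for t in vocab}
        model[label] = (math.log(n_docs / len(labels)), log_like)
    return model

def predict(model, vec):
    """Return the class with the highest log posterior score."""
    def score(label):
        log_prior, log_like = model[label]
        return log_prior + sum(w * log_like[t] for t, w in vec.items())
    return max(model, key=score)

# Invented, manually labeled toy tweets (illustrative only).
data = [
    ("The vaccine is safe and I feel great https://t.co/x", "positive"),
    ("Grateful for the free vaccine, thank you @kemenkes", "positive"),
    ("Vaccine works, feeling safe and happy", "positive"),
    ("Happy to get my second dose today", "positive"),
    ("I am scared of the vaccine side effects", "negative"),
    ("The vaccine made me sick, terrible", "negative"),
    ("Do not trust this vaccine, it is dangerous", "negative"),
    ("Terrible queue and bad service at the vaccine site", "negative"),
    ("Feeling happy and safe after the vaccine", "positive"),
    ("This vaccine is dangerous and terrible", "negative"),
]

# 80% training data, 20% test data (no shuffling, to keep the toy run deterministic).
split = len(data) * 8 // 10
train, test = data[:split], data[split:]

train_docs = [preprocess(text) for text, _ in train]
idf = fit_idf(train_docs)
model = train_nb([tfidf(d, idf) for d in train_docs], [y for _, y in train], idf)

predictions = [predict(model, tfidf(preprocess(text), idf)) for text, _ in test]

# Confusion matrix as counts of (actual, predicted) pairs; accuracy is the diagonal share.
confusion = Counter((actual, pred) for (_, actual), pred in zip(test, predictions))
accuracy = sum(n for (actual, pred), n in confusion.items() if actual == pred) / len(test)
print(confusion, accuracy)
```

The IDF table is fitted on the training tweets alone, so no information from the test tweets leaks into the weights. On a real dataset the tweets would be shuffled before splitting, and the accuracy on a toy set this small says nothing about the 82% reported in the abstract.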