Sentiment Analysis of the Twitter Dataset for the Prediction of Sentiments

Authors

Srinivasa Sai Abhijit Challapalli Student, The University of Texas at Arlington, Arlington, Texas, United States of America

DOI:

https://doi.org/10.69996/jsihs.2024017

Keywords:

Sentiment analysis, long short-term memory (LSTM), classification, deep learning, bidirectional

Abstract

Sentiment analysis of Twitter data uses natural language processing (NLP) and machine learning techniques to classify the sentiments expressed in tweets. The goal is to label each tweet according to the sentiment, emotion, or behavior expressed in its text, using labels such as positive, negative, or neutral. Twitter sentiment analysis is an important tool for quickly gauging public opinion on topics, events, or brands. The process usually begins with the collection of a large set of tweets, followed by preprocessing steps such as tokenization, outlier removal, and text normalization to clean the data. The text is then converted into a numerical representation suitable for machine learning models using feature extraction techniques such as Bag-of-Words, Term Frequency-Inverse Document Frequency (TF-IDF), and word embeddings. Deep learning models such as convolutional neural networks (CNN), long short-term memory networks (LSTM), and bidirectional LSTM (BiLSTM) are commonly trained to predict sentiment. This paper presents an effective method for sentiment analysis of Twitter data using these three deep learning models. The dataset contains 7000 tweets, which are preprocessed using text cleaning methods including tokenization, stop-word removal, and lemmatization. The data are then converted to numerical vectors using bag-of-words and word embeddings. After training, the CNN model reaches a training accuracy of 0.95 and a test accuracy of 0.92, the LSTM model a training accuracy of 0.97 and a test accuracy of 0.90, and the BiLSTM model a training accuracy of 1.0 and a test accuracy of 0.9. The results show a tendency to overfit, with the models scoring higher on the training data than on the test data. However, the models successfully classified tweets into positive, negative, and neutral groups, demonstrating the potential of deep learning for capturing sentiment from social media.
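
The preprocessing and vectorization steps named in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the sample tweets, the short stop-word list, and the function names `preprocess`, `tfidf`, and `generalization_gaps` are all assumptions made for illustration, and lemmatization and model training are left out. The last part computes the gap between training and test accuracy from the figures quoted in the abstract, assuming the first CNN figure (0.95) is the training accuracy.

```python
import math
import re
from collections import Counter

# Hypothetical stop-word list for illustration; a real pipeline would use a fuller one.
STOP_WORDS = {"a", "an", "the", "is", "are", "was", "to", "of", "and", "this", "it", "i"}


def preprocess(tweet):
    """Lowercase, strip URLs and @mentions, tokenize, and drop stop words."""
    text = re.sub(r"https?://\S+|@\w+", " ", tweet.lower())
    tokens = re.findall(r"[a-z']+", text)
    return [t for t in tokens if t not in STOP_WORDS]


def tfidf(docs):
    """Return one {term: weight} dict per tokenized document.

    Uses tf = count / len(doc) and idf = log(N / df), one common TF-IDF variant.
    """
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


def generalization_gaps(results):
    """Map each model name to (training accuracy - test accuracy), rounded to 2 places."""
    return {name: round(train - test, 2) for name, (train, test) in results.items()}


# Made-up example tweets, not taken from the paper's dataset.
tweets = [
    "I love this phone, the camera is great!",
    "This phone is terrible @support https://example.com",
    "The phone arrived today",
]
docs = [preprocess(t) for t in tweets]
vectors = tfidf(docs)

# (training accuracy, test accuracy) as quoted in the abstract; the CNN
# training figure is an assumption about which of its two numbers is which.
REPORTED = {"CNN": (0.95, 0.92), "LSTM": (0.97, 0.90), "BiLSTM": (1.0, 0.9)}
gaps = generalization_gaps(REPORTED)

print(docs)
print(vectors[0])
print(gaps)
```

In this sketch a term that occurs in every tweet, such as "phone", gets an IDF of log(1) = 0 and so carries no weight, which is the intended effect of TF-IDF. Under the stated assumption about the CNN figures, the gap between training and test accuracy grows from the CNN to the LSTM to the BiLSTM, consistent with the overfitting the abstract reports.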

