Islamophobia Content Detection Using Natural Language Processing

Authors

  • Abdul Jaleel Department of Computer Science, University of Engineering and Technology, Taxila, 47080, Pakistan
  • Mehmoon Anwar Department of Computer Science, University of Engineering and Technology, Taxila, 47080, Pakistan
  • Farooq Ali Department of Computer Science, University of Engineering and Technology, Taxila, 47080, Pakistan
  • Raza Mukhtar Department of Computer Science, University of Engineering and Technology, Taxila, 47080, Pakistan
  • Muhammad Farooq Department of Information Technology, University of the Punjab, Lahore, Pakistan

Keywords:

BERT, Islamophobia, LSTM, Twitter

Abstract

With hate and discrimination based on caste and race on the rise, Islamophobia has become one of the most significant and widespread phenomena today. Islamophobia generally refers to irrational antagonism, fear, or hatred of Islam, Muslims, and Islamic culture, as well as actual discrimination against these groups or individuals within them. It has risen steadily over the last five years, and the trend has continued over the past year or so. Slight declines occurred at the start, middle, and end of 2021, indicating that the level varies considerably from month to month, but the overall trend is upward. In terms of magnitude, Europe deserves special attention, followed by Asia and North America. Several studies have targeted this issue, but the domain needs further attention, and proper systems or filters should be created to keep gender-, race-, or culture-based discrimination out of, at the very least, the social spaces used by millions. This research focuses on identifying Islamophobic content on social media, and on Twitter in particular. To our knowledge, this is one of very few studies that address Islamophobia using an advanced model such as BERT. Our objective is to find a model that can accurately classify Islamophobic tweets collected from Twitter. First, we constructed a dataset by extracting tweets using specific keywords. Next, we labeled the extracted tweets as hateful or non-hateful according to whether they displayed Islamophobia. The dataset was then pre-processed to remove extraneous information such as punctuation, stop words, empty entries, and duplicates. Finally, two models, LSTM and BERT, were applied to the dataset: LSTM achieved an accuracy of 93.3 percent and BERT an accuracy of 97.1 percent.
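
The pre-processing step described in the abstract (removing punctuation, stop words, empty entries, and duplicates) could look roughly like the following. This is a minimal sketch and not the authors' implementation: the function names `clean_tweet` and `preprocess`, the tiny `STOP_WORDS` set, and the lowercasing step are all assumptions made for illustration, since the abstract does not say which stop-word list or normalization was used.

```python
import string

# Hypothetical, deliberately tiny stop-word list (assumption: the abstract
# does not say which list the authors used).
STOP_WORDS = {"a", "an", "the", "is", "are", "and", "or", "of", "to", "in"}

# Translation table that deletes every ASCII punctuation character.
_PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def clean_tweet(text):
    """Lowercase (an assumption), strip punctuation, and drop stop words."""
    no_punct = text.lower().translate(_PUNCT_TABLE)
    tokens = [tok for tok in no_punct.split() if tok not in STOP_WORDS]
    return " ".join(tokens)


def preprocess(tweets):
    """Clean each tweet, then drop empty entries and duplicates.

    The order of first occurrence is preserved.
    """
    seen = set()
    cleaned = []
    for tweet in tweets:
        text = clean_tweet(tweet)
        if not text or text in seen:
            continue  # skip empty entries and repeats
        seen.add(text)
        cleaned.append(text)
    return cleaned


if __name__ == "__main__":
    sample = ["Hello, world!", "hello world", "   ", "The and of", "A second tweet."]
    print(preprocess(sample))
```

Duplicates are detected after cleaning, so two tweets that differ only in case or punctuation collapse into one entry. Whether the original study deduplicated before or after cleaning is not stated.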

Published

2023-03-29

How to Cite

Abdul Jaleel, Mehmoon Anwar, Farooq Ali, Raza Mukhtar, & Muhammad Farooq. (2023). Islamophobia Content Detection Using Natural Language Processing. Journal of Computing & Biomedical Informatics, 4(02), 88–97. Retrieved from https://jcbi.org/index.php/Main/article/view/130