Islamophobia Content Detection Using Natural Language Processing
Keywords:
BERT, Islamophobia, LSTM, Twitter
Abstract
Amid growing hate and discrimination based on caste and race, Islamophobia has become one of the most prominent and widespread forms of prejudice. Islamophobia generally refers to irrational antagonism toward, fear of, or hatred of Islam, Muslims, and Islamic culture, as well as actual discrimination against these groups or the people within them. Islamophobia has been rising steadily over the last five years, and this trend has continued during the last year or so. Slight declines occurred at the start, middle, and end of 2021, indicating that the level varies considerably from month to month, but the overall trend is upward. In terms of magnitude, Europe deserves special attention, followed by Asia and North America. Several studies have targeted this issue, but the domain needs further attention, and proper systems or filters should be created to keep gender-, race-, or culture-based discrimination out of, at the very least, the social spaces used by millions. This research focuses on identifying Islamophobic content on social media, and on Twitter in particular. To our knowledge, this is one of the very few studies that addresses Islamophobia using an advanced model such as BERT. Our objective is to find a model that can accurately classify Islamophobic tweets collected from Twitter. First, we constructed a dataset by extracting tweets using specific keywords. Next, we labeled the extracted tweets as either hateful or non-hateful according to whether they displayed Islamophobia. The dataset was then pre-processed to remove extraneous information such as punctuation, stop words, empty entries, and duplicates. Finally, two models, LSTM and BERT, were trained on the dataset: LSTM achieved an accuracy of 93.3 percent and BERT achieved an accuracy of 97.1 percent.
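The pre-processing step described in the abstract (removing punctuation, stop words, empty entries, and duplicates) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `clean_tweet` and `preprocess`, the lowercasing step, and the tiny `STOP_WORDS` set are assumptions made purely for illustration, since the abstract does not say which stop-word list or tooling was used.

```python
import string

# Hypothetical, deliberately tiny stop-word list (assumption: the study
# does not state which list it used).
STOP_WORDS = {"a", "an", "the", "is", "are", "of", "to", "and", "in", "on"}

# Translation table that deletes every ASCII punctuation character.
_PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def clean_tweet(text):
    """Lowercase one tweet, strip punctuation, and drop stop words."""
    text = text.lower().translate(_PUNCT_TABLE)
    tokens = [t for t in text.split() if t not in STOP_WORDS]
    return " ".join(tokens)


def preprocess(tweets):
    """Clean each tweet, then drop empty entries and duplicates.

    The order of first occurrence is preserved.
    """
    seen = set()
    cleaned = []
    for raw in tweets:
        if raw is None:          # missing entry
            continue
        text = clean_tweet(raw)
        if not text or text in seen:  # empty after cleaning, or duplicate
            continue
        seen.add(text)
        cleaned.append(text)
    return cleaned
```

In this sketch, duplicates are detected after cleaning, so two tweets that differ only in case, punctuation, or stop words collapse into one entry. Whether the original study deduplicated before or after cleaning is not stated.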
License
This is an open access article published by the Research Center of Computing & Biomedical Informatics (RCBI), Lahore, Pakistan, under the CC BY 4.0 International License.