Fake News Classification using Machine Learning: Count Vectorizer and Support Vector Machine

Authors

  • Sajid Khan Department of Computer Science, University of Engineering & Technology Taxila, Pakistan.
  • Mehmoon Anwar Department of Computer Science, University of Engineering & Technology Taxila, Pakistan.
  • Huma Qayyum Department of Software Engineering, University of Engineering & Technology Taxila, Pakistan.
  • Farooq Ali Department of Computer Science, University of Engineering & Technology Taxila, Pakistan.
  • Marriam Nawaz Department of Computer Science, University of Engineering & Technology Taxila, Pakistan.

DOI:

https://doi.org/10.56979/401/2022/85

Keywords:

NLP, Fake Articles, Machine Learning, SVM, Classification, Count vectorizer

Abstract

The rapid growth of internet access and the rapid uptake of social networking sites such as Twitter and Facebook have led to the generation of an unprecedented volume of data. Because digital platforms are so widely used, users are producing and disseminating more content than ever, some of which is false and spreads misleading narratives. Accurately identifying textual documents as fabricated or false is a difficult task. Most existing research focuses on specific datasets or topics, most notably politics. As a result, the trained models perform well on one category of documents but do not perform robustly when evaluated on articles from other domains. In this study, we propose a solution to the fake news identification task that combines NLP and machine learning techniques. In the first phase, the data is pre-processed by removing null values, duplicate entries, and punctuation marks. The cleaned data is then converted into a numeric representation using a count vectorizer (CV) and a TF-IDF vectorizer. For the classification task, we use an SVM classifier. Our proposed solution performed well and caters to all fake news domains.
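
The pipeline described in the abstract (cleaning, vectorization, SVM classification) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes scikit-learn, a linear kernel (the abstract does not name one), and a tiny hypothetical corpus invented purely for illustration. The helper names `clean_corpus` and `build_classifier` are likewise hypothetical.

```python
import string

from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC


def clean_corpus(texts, labels):
    """Drop null/empty and duplicate articles, then strip punctuation."""
    strip_punct = str.maketrans("", "", string.punctuation)
    seen = set()
    clean_texts, clean_labels = [], []
    for text, label in zip(texts, labels):
        if text is None or not text.strip():
            continue  # null or empty entry
        if text in seen:
            continue  # exact duplicate of an earlier article
        seen.add(text)
        clean_texts.append(text.translate(strip_punct))
        clean_labels.append(label)
    return clean_texts, clean_labels


def build_classifier(vectorizer):
    """Chain a text vectorizer with an SVM (linear kernel is an assumption)."""
    return make_pipeline(vectorizer, SVC(kernel="linear"))


# Hypothetical toy corpus, not data from the paper: 1 = fake, 0 = real.
texts = [
    "Aliens secretly control the city council, insiders claim!",
    "Miracle pill cures every disease overnight, doctors stunned!",
    None,                                                         # null entry
    "Aliens secretly control the city council, insiders claim!",  # duplicate
    "City council approves budget for road repairs.",
    "Health ministry publishes annual vaccination report.",
    "",                                                           # empty entry
]
labels = [1, 1, 1, 1, 0, 0, 0]

clean_texts, clean_labels = clean_corpus(texts, labels)

# Train one model per numeric representation named in the abstract.
models = {
    "count": build_classifier(CountVectorizer()),
    "tfidf": build_classifier(TfidfVectorizer()),
}
for name, model in models.items():
    model.fit(clean_texts, clean_labels)

new_article = ["Doctors stunned as miracle pill cures disease"]
predictions = {name: int(model.predict(new_article)[0])
               for name, model in models.items()}
```

Wrapping the vectorizer and classifier in one pipeline keeps the vocabulary learned at training time tied to the model, so new articles are transformed consistently at prediction time. A real evaluation would use a held-out test split rather than this toy corpus.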

Published

2022-12-29

How to Cite

Sajid Khan, Mehmoon Anwar, Huma Qayyum, Farooq Ali, & Marriam Nawaz. (2022). Fake News Classification using Machine Learning: Count Vectorizer and Support Vector Machine. Journal of Computing & Biomedical Informatics, 4(01), 54–63. https://doi.org/10.56979/401/2022/85