Author Identification Using Machine Learning
Keywords:
Author identification, SVM, NLP, Machine Learning, Feature Extraction
Abstract
Author identification determines who wrote a text, anonymous or not, from its writing style alone rather than from its content. Writing and speaking style can usually be viewed as patterns of sentence construction, which can be measured through features such as vocabulary, sentence length, word sequences, vocabulary richness, and word frequency. The primary goal of this article is to examine and apply a variety of classification approaches to research articles in order to analyze author identity and the content of texts whose authorship is in dispute. Earlier work by other researchers is also reviewed, after which we report improved results from our own experiments. The support vector machine (SVM), which operates on feature spaces, is particularly well suited to this problem. Across all experiments, the SVM was effective at determining the authorship of research publications. In this study, the SVM was used to classify two sets of data, referred to as Experiment A and Experiment B. Experiment A includes 500 research papers from conferences, most of them obtained through Google Scholar. After applying data mining techniques to the gathered dataset, we obtained a final set of 400 research papers. For better performance, these 400 papers were further divided into three subsets, A, B, and C. As a result of increasing both the number of authors and the number of publications in the dataset, our model trained quickly and performed well under these limited research criteria.
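To make the style features named in the abstract concrete, the following is a minimal Python sketch of how sentence length, vocabulary richness, and word frequency might be computed for a single text. It is an illustration only and not the authors' pipeline: the function name `stylometric_features`, the regular-expression tokenization, and the `top_k` parameter are all assumptions introduced here. Feature values like these could then be passed to an SVM classifier.

```python
import re
from collections import Counter


def stylometric_features(text, top_k=3):
    """Compute a few simple style features for one text (hypothetical helper)."""
    # Crude sentence split on ., !, ? (an assumption, not a real sentence tokenizer).
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Lowercased word tokens made of letters and apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    if not words or not sentences:
        return {
            "avg_sentence_length": 0.0,
            "type_token_ratio": 0.0,
            "avg_word_length": 0.0,
            "top_words": [],
        }
    counts = Counter(words)
    return {
        # Sentence length: mean number of words per sentence.
        "avg_sentence_length": len(words) / len(sentences),
        # Vocabulary richness: distinct words divided by total words.
        "type_token_ratio": len(counts) / len(words),
        # Mean number of characters per word.
        "avg_word_length": sum(len(w) for w in words) / len(words),
        # Word frequency usage: the most frequent words in the text.
        "top_words": [w for w, _ in counts.most_common(top_k)],
    }


sample = "The cat sat. The cat ran away!"
features = stylometric_features(sample)
print(features)
```

In practice, each document would be mapped to a numeric vector of such features (the `top_words` list would need to be encoded as counts over a fixed vocabulary) before training a classifier.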
License
This is an open access article published by the Research Center of Computing & Biomedical Informatics (RCBI), Lahore, Pakistan, under the CC BY 4.0 International License.