Most people now get their news from online platforms, where information spreads within seconds. Because content is so easy to share, false information often reaches a large audience before it can be verified, and the growing volume of online data means that human verification alone cannot keep up. Systems that automatically identify unreliable news are therefore essential. The proposed model separates genuine information from misleading content using data analysis techniques in R. Rather than relying on manual checking, it applies Natural Language Processing and machine learning methods to article content, examining word usage trends, writing style variations, unusual wording patterns, and emotional signals in the text. It also evaluates whether the publishing source appears trustworthy, compares articles with verified reference data, and analyzes previous publication records to estimate whether a news item is likely to be false.
Keywords
Fake News Detection, Data Analytics, Logistic Regression, R Language, Text Classification
Conclusion
The Fake News Detection System designed in this study demonstrates the practical application of machine learning and natural language processing methods to recognizing misleading information across digital platforms. Implemented in the R programming language, the system preprocesses text, derives relevant features, and applies Logistic Regression to classify news content as either genuine or false. This approach reduces reliance on manual verification and offers an automated, scalable, and time-efficient mechanism for misinformation detection.
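The pipeline summarized above (preprocess, derive features, fit Logistic Regression) can be illustrated with a minimal base R sketch. It is not the system's actual implementation: the headlines are toy examples invented for illustration, the `sensational_terms` lexicon and the single count feature are hypothetical stand-ins for the study's features, and the names `preprocess`, `news`, and `p_false` are assumptions introduced here.

```r
# Toy headlines invented for illustration only; 1 = false, 0 = genuine.
texts <- c(
  "SHOCKING secret miracle cure revealed!",
  "Secret plan exposed: shocking details inside",
  "Miracle diet melts fat overnight",
  "Celebrity spotted on the moon last night",
  "City council approves new budget",
  "Central bank holds interest rates steady",
  "Researchers publish shocking rise in sea levels",
  "Local school opens new library"
)
label <- c(1, 1, 1, 1, 0, 0, 0, 0)

# Hypothetical lexicon of sensational terms (an assumption, not from the study).
sensational_terms <- c("shocking", "secret", "miracle")

# Preprocessing: lowercase, replace non-letters with spaces, split into tokens.
preprocess <- function(text) {
  text <- gsub("[^a-z ]", " ", tolower(text))
  tokens <- strsplit(text, " +")[[1]]
  tokens[tokens != ""]
}

# Feature: number of tokens in each headline that appear in the lexicon.
sensational <- vapply(
  texts,
  function(txt) sum(preprocess(txt) %in% sensational_terms),
  integer(1),
  USE.NAMES = FALSE
)

# Logistic Regression via glm() with a binomial family.
news <- data.frame(label = label, sensational = sensational)
model <- glm(label ~ sensational, data = news, family = binomial)

# Estimated probability that an unseen headline is false.
new_headline <- "Shocking miracle pill banned by doctors"
new_count <- sum(preprocess(new_headline) %in% sensational_terms)
p_false <- predict(model, newdata = data.frame(sensational = new_count),
                   type = "response")
print(p_false)
```

A single hand-picked feature keeps the sketch readable. A realistic model would use many more features and a held-out test set.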
Experimental findings indicate that feature engineering methods, including TF-IDF representations and n-gram analysis, contribute substantially to classification performance. Incorporating source credibility assessment further improves the reliability of predictions. In addition, the visualization and evaluation components help interpret misinformation trends and assess overall model effectiveness.
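As a minimal sketch of the two feature engineering methods named above, the following base R code builds unigram and bigram terms and weights them by TF-IDF. It assumes one common TF-IDF variant (term count divided by document length, multiplied by log(N / document frequency)); the study may use a different weighting or a dedicated text mining package. The helper names `tokenize`, `make_ngrams`, and `tfidf_matrix` and the three toy documents are hypothetical.

```r
# Lowercase, replace non-letters with spaces, and split into tokens.
tokenize <- function(text) {
  tokens <- strsplit(gsub("[^a-z ]", " ", tolower(text)), " +")[[1]]
  tokens[tokens != ""]
}

# Build word n-grams by joining n consecutive tokens with a space.
make_ngrams <- function(tokens, n) {
  if (length(tokens) < n) return(character(0))
  starts <- seq_len(length(tokens) - n + 1)
  vapply(starts,
         function(i) paste(tokens[i:(i + n - 1)], collapse = " "),
         character(1))
}

# TF-IDF matrix: one row per document, one column per term.
tfidf_matrix <- function(docs_terms) {
  vocab <- sort(unique(unlist(docs_terms)))
  # Term frequency: count of each term divided by the document's term count.
  tf <- do.call(rbind, lapply(docs_terms, function(terms) {
    as.numeric(table(factor(terms, levels = vocab))) / max(length(terms), 1)
  }))
  colnames(tf) <- vocab
  doc_freq <- colSums(tf > 0)                 # documents containing each term
  idf <- log(length(docs_terms) / doc_freq)   # zero for terms in every document
  sweep(tf, 2, idf, "*")
}

# Toy documents invented for illustration only.
docs <- c("Fake news spreads fast",
          "Real news travels slowly",
          "News agencies verify claims")

# Represent each document by its unigrams plus its bigrams.
docs_terms <- lapply(docs, function(d) {
  tokens <- tokenize(d)
  c(tokens, make_ngrams(tokens, 2))
})

features <- tfidf_matrix(docs_terms)
print(round(features[, c("news", "fake", "fake news")], 3))
```

Under this weighting, a term that occurs in every document, such as "news" here, receives a weight of zero, while terms specific to one document receive the largest weights. This is why TF-IDF tends to emphasize distinctive wording.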