An R-Based Intelligent Data Analytics Model For Fake News Detection | IJCSE Volume 10 – Issue 1 | IJCSE-V10I1P17
International Journal of Computer Science Engineering Techniques
ISSN: 2455-135X
Volume 10, Issue 1
Authors
R. Jagadeesh, Dr. S. Selvakani, Mrs. K. Vasumathi
Abstract
People today receive news mainly through online platforms, where information spreads within seconds. Because content can be shared so easily, false information often reaches a large audience before it can be verified. This situation makes it essential to develop systems that can automatically identify unreliable news. The proposed model focuses on separating genuine information from misleading content using data analysis techniques in R. Instead of depending on manual checking, the system studies how the text is written, observes unusual wording patterns, and evaluates whether the publishing source appears trustworthy. As the volume of online data continues to grow, human verification alone cannot handle the workload. For this reason, the model applies Natural Language Processing and machine learning methods to examine article content. It reviews word usage trends, writing style variations, and emotional signals present in the text. In addition, the system compares articles with verified reference data and analyzes previous publication records to estimate whether a news item is likely to be false.
Keywords
Fake News Detection, Data Analytics, Logistic Regression, R Language, Text Classification
Conclusion
The Fake News Detection System designed in this study demonstrates the practical application of machine learning and natural language processing methods to recognizing misleading information across digital platforms. Implemented in the R programming language, the system preprocesses text, derives relevant features, and applies Logistic Regression to categorize news content as either genuine or false. This approach reduces reliance on manual verification and offers an automated, scalable, and time-efficient mechanism for misinformation detection.
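The preprocessing, feature derivation, and Logistic Regression steps described above can be illustrated with a minimal base R sketch. It is not the authors' implementation: the headlines and labels are invented toy data, the helpers clean_text and count_matches are hypothetical names, and the two stylistic features (exclamation marks and all-caps words) are assumed stand-ins for the "unusual wording patterns" the study mentions.

```r
# Toy headlines and labels, invented for illustration (1 = false, 0 = genuine).
news <- data.frame(
  text = c(
    "SHOCKING!!! Miracle cure doctors HATE revealed!",
    "City council approves budget for road repairs",
    "You WON'T BELIEVE what this celebrity did!!!",
    "Central bank holds interest rates steady",
    "Scientists publish study on coastal erosion",
    "BREAKING: Secret plot EXPOSED, share before deleted!",
    "Local school wins regional science award!",
    "Officials CONFIRM new bridge opening date",
    "Anonymous insiders say moon landing was staged",
    "WOW!! Stadium reopens after LONG renovation"
  ),
  label = c(1, 0, 1, 0, 0, 1, 0, 0, 1, 0),
  stringsAsFactors = FALSE
)

# Hypothetical helper: lowercase, drop non-letters, collapse repeated spaces.
clean_text <- function(x) {
  x <- tolower(x)
  x <- gsub("[^a-z ]", " ", x)
  trimws(gsub(" +", " ", x))
}

# Hypothetical helper: count regex matches in each string.
count_matches <- function(pattern, x) {
  lengths(regmatches(x, gregexpr(pattern, x, perl = TRUE)))
}

# Cleaned text, as it would be passed on to word-based features.
news$clean <- clean_text(news$text)

# Stylistic features come from the raw text, because cleaning removes
# punctuation and capitalization.
news$n_exclaim  <- count_matches("!", news$text)
news$caps_words <- count_matches("\\b[A-Z]{2,}\\b", news$text)

# Logistic Regression: model the probability that an item is false.
model <- glm(label ~ n_exclaim + caps_words, data = news, family = binomial)
news$p_false <- predict(model, type = "response")

print(news[, c("label", "n_exclaim", "caps_words", "p_false")])
```

With so few rows the fitted probabilities are only illustrative. A realistic system would train on a labeled corpus and evaluate on held-out articles.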
Experimental findings indicate that feature engineering methods, including TF-IDF representations and n-gram analysis, contribute substantially to improved classification performance. Incorporating source credibility assessment further strengthens prediction dependability. Additionally, visualization and evaluation components help interpret misinformation trends and assess overall model effectiveness.
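To make the TF-IDF and n-gram features concrete, the following base R sketch computes TF-IDF weights over unigrams and bigrams for three short documents. The documents are invented, the helper ngrams is a hypothetical name, and the weighting used here (relative term frequency multiplied by the natural log of inverse document frequency) is one common variant. The source does not state which formulation or R package the study used.

```r
# Three already-cleaned toy documents, invented for illustration.
docs <- c(
  "miracle cure revealed",
  "council approves road budget",
  "miracle cure shocks doctors"
)

# Hypothetical helper: unigrams plus adjacent-word bigrams for one document.
ngrams <- function(doc) {
  words <- strsplit(doc, " ", fixed = TRUE)[[1]]
  bigrams <- if (length(words) > 1) {
    paste(head(words, -1), tail(words, -1))
  } else {
    character(0)
  }
  c(words, bigrams)
}

tokens <- lapply(docs, ngrams)
vocab  <- sort(unique(unlist(tokens)))

# Term frequency: share of each n-gram within its document.
# vapply yields vocab x docs, so transpose to docs x vocab.
tf <- t(vapply(
  tokens,
  function(tok) as.numeric(table(factor(tok, levels = vocab))) / length(tok),
  numeric(length(vocab))
))
colnames(tf) <- vocab

# Inverse document frequency: n-grams found in fewer documents weigh more.
idf <- log(nrow(tf) / colSums(tf > 0))

# TF-IDF: scale each column (n-gram) by its IDF.
tfidf <- sweep(tf, 2, idf, `*`)

print(round(tfidf[, c("miracle cure", "council", "doctors")], 3))
```

The bigram "miracle cure" occurs in two of the three documents, so it gets a lower weight than n-grams unique to one document, and its weight is zero in the document where it does not occur. The resulting matrix could serve as the predictor matrix for a classifier such as Logistic Regression.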