Safeguarding Social Media Users through Machine Learning–Powered Fake URL Detection Systems | IJCSE Volume 9 – Issue 6 | IJCSE-V9I6P24

International Journal of Computer Science Engineering Techniques

ISSN: 2455-135X
Volume 9, Issue 6  |  Published:
Author

Abstract

Fake URLs have become a major instrument for spreading false information on social media. Identifying them is important both for limiting the propagation of misinformation and for preserving the reliability of social media networks. In this paper, we propose a machine learning-based approach to identifying fake URLs on social media. We trained and evaluated several machine learning models, including decision trees, random forests, and support vector machines, on a dataset of real and fake URLs gathered from social media websites. The random forest algorithm achieved the highest accuracy, 96.5%, outperforming the other algorithms. We also examined the usefulness of various features, such as URL length, domain name, and URL structure, and found the domain name and URL structure to be the most informative features for identifying fake URLs. Our approach offers a reliable and efficient way to detect fake URLs on social media, and it can be applied to curb the spread of false information and to protect the reliability of social media networks.
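The feature types named in the abstract (URL length, domain name, and URL structure) can be illustrated with a small extraction routine. The following is a minimal sketch using only the Python standard library; the function name `extract_lexical_features` and the specific features it returns are assumptions chosen for illustration, not the feature set used in the paper.

```python
from urllib.parse import urlparse


def extract_lexical_features(url: str) -> dict:
    """Return a few simple lexical features of a URL.

    Illustrative only: the feature names and choices here are
    assumptions, not the feature set used in the paper.
    """
    # urlparse only recognises the host when a scheme is present,
    # so prepend one for bare inputs such as "example.com/login".
    parsed = urlparse(url if "://" in url else "http://" + url)
    host = parsed.hostname or ""
    path = parsed.path or ""
    return {
        # Length of the URL exactly as it was given.
        "url_length": len(url),
        # Domain-name features.
        "domain_length": len(host),
        "num_dots_in_domain": host.count("."),
        "num_digits_in_domain": sum(ch.isdigit() for ch in host),
        "has_hyphen_in_domain": "-" in host,
        # Structural features of the rest of the URL.
        "path_depth": len([seg for seg in path.split("/") if seg]),
        "has_query": bool(parsed.query),
        "uses_https": parsed.scheme == "https",
    }


if __name__ == "__main__":
    for example in ("https://example.com/a/b?x=1", "http://192.168.0.1/login"):
        print(example, extract_lexical_features(example))
```

Each dictionary could then be turned into a numeric feature vector and passed to any of the classifiers mentioned above; which features actually carry the most signal would have to be checked on the real dataset.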

Keywords

Fake URL, Machine learning, Phishing, Spamming, Malware, Lexical Features, CatBoost classifier, Gradient boosting classifier, SPM, Decision tree, Logistic Regression, Naive Bayes classifier, KNN.

Conclusion

This project has demonstrated that machine learning techniques are effective for detecting fake URLs and thereby limiting the spread of fake news and misinformation on social media platforms. The system uses several machine learning algorithms, such as decision trees, random forests, and neural networks, to analyze and detect fake URLs in real time, and it uses metrics such as accuracy and F1 score to evaluate the performance of the models. Overall, the proposed system helps safeguard individuals and communities against harmful content, protect sensitive information from cyber-attacks, and promote a secure online environment. Future research can explore additional features, classifiers, and techniques that could be integrated into the system to improve its detection performance.
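The two evaluation metrics named above, accuracy and F1 score, can be computed directly from true and predicted labels. The sketch below uses plain Python; the function name `accuracy_and_f1` and the convention that label 1 means "fake" are assumptions made for illustration, and the sample labels are invented rather than taken from the paper's results.

```python
def accuracy_and_f1(y_true, y_pred, positive=1):
    """Compute accuracy and F1 score for binary labels.

    `positive` is the label treated as the "fake URL" class
    (an assumption for this example).
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return accuracy, f1


if __name__ == "__main__":
    # Invented labels: 1 = fake URL, 0 = legitimate URL.
    y_true = [1, 1, 1, 0, 0, 0, 0, 0]
    y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
    acc, f1 = accuracy_and_f1(y_true, y_pred)
    print(f"accuracy={acc:.3f}, f1={f1:.3f}")
```

Reporting F1 alongside accuracy is useful when fake URLs are much rarer than legitimate ones, since a model that labels everything legitimate can still score a high accuracy.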

© 2025 International Journal of Computer Science Engineering Techniques (IJCSE).