Detection of Machine-Generated Tweets through a Hybrid CNN-LSTM Deep Learning Framework | IJCSE Volume 9 – Issue 6 | IJCSE-V9I6P20

International Journal of Computer Science Engineering Techniques

ISSN: 2455-135X
Volume 9, Issue 6

Abstract

Recent breakthroughs in natural language generation can be used as another tool for steering public opinion on social media. Advances in language modelling have substantially improved the generative capabilities of deep neural models, making the content they produce more convincing. As a result, adversaries can use text-generative models to power social bots that produce realistic deepfake posts and manipulate public conversation. Addressing this problem requires effective and accurate algorithms for detecting deepfake social media messages. This project therefore focuses on detecting machine-generated text on social networks such as Twitter. It uses a simple deep learning model with word embeddings to classify tweets as human-written or bot-generated, using the publicly available TweepFake dataset. A Convolutional Neural Network (CNN) combined with a Long Short-Term Memory (LSTM) network is developed to detect deepfake tweets. To demonstrate the quality of the proposed method, several machine learning models are used as baselines for comparison. The proposed approach is also compared with other deep learning networks, such as a standalone LSTM, which demonstrates its efficiency and highlights its benefits for this task.
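To make the CNN-LSTM idea concrete, the following is a minimal sketch of one forward pass through such a pipeline: embedding lookup, 1D convolution with ReLU, max pooling, an LSTM, and a sigmoid output. It is not the authors' architecture. The layer sizes, the random untrained weights, and every function name (`build_model`, `predict_bot_probability`, and so on) are assumptions chosen for illustration, and only the Python standard library is used.

```python
import math
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def conv1d_relu(seq, kernels, bias):
    """Valid 1D convolution over time, then ReLU.
    seq: T vectors of size D; kernels: F filters, each K x D."""
    k = len(kernels[0])
    out = []
    for t in range(len(seq) - k + 1):
        window = seq[t:t + k]
        out.append([
            max(0.0, bias[f] + sum(dot(kv, xv) for kv, xv in zip(kern, window)))
            for f, kern in enumerate(kernels)
        ])
    return out

def max_pool1d(seq, size):
    """Non-overlapping max pooling along the time axis."""
    return [
        [max(col) for col in zip(*seq[s:s + size])]
        for s in range(0, len(seq) - size + 1, size)
    ]

def lstm_last_hidden(seq, W, U, b):
    """Single-layer LSTM; returns the hidden state after the last step.
    Gates: i (input), f (forget), o (output), g (candidate)."""
    hidden = len(b["i"])
    h, c = [0.0] * hidden, [0.0] * hidden
    for x in seq:
        pre = {
            gate: [b[gate][j] + dot(W[gate][j], x) + dot(U[gate][j], h)
                   for j in range(hidden)]
            for gate in "ifog"
        }
        i = [sigmoid(v) for v in pre["i"]]
        f = [sigmoid(v) for v in pre["f"]]
        o = [sigmoid(v) for v in pre["o"]]
        g = [math.tanh(v) for v in pre["g"]]
        c = [f[j] * c[j] + i[j] * g[j] for j in range(hidden)]
        h = [o[j] * math.tanh(c[j]) for j in range(hidden)]
    return h

def rand_matrix(rng, rows, cols, scale=0.1):
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]

def build_model(vocab_size, embed_dim=8, filters=4, kernel=3, hidden=5, seed=0):
    """Hypothetical, untrained parameters; all sizes are illustrative assumptions."""
    rng = random.Random(seed)
    return {
        "embed": rand_matrix(rng, vocab_size, embed_dim),
        "kernels": [rand_matrix(rng, kernel, embed_dim) for _ in range(filters)],
        "conv_bias": [0.0] * filters,
        "W": {g: rand_matrix(rng, hidden, filters) for g in "ifog"},
        "U": {g: rand_matrix(rng, hidden, hidden) for g in "ifog"},
        "b": {g: [0.0] * hidden for g in "ifog"},
        "out_w": [rng.uniform(-0.1, 0.1) for _ in range(hidden)],
        "out_b": 0.0,
    }

def predict_bot_probability(model, token_ids, pool=2):
    """Embedding -> Conv1D + ReLU -> max pooling -> LSTM -> sigmoid."""
    embedded = [model["embed"][t] for t in token_ids]
    features = conv1d_relu(embedded, model["kernels"], model["conv_bias"])
    pooled = max_pool1d(features, pool)
    h = lstm_last_hidden(pooled, model["W"], model["U"], model["b"])
    return sigmoid(model["out_b"] + dot(model["out_w"], h))

model = build_model(vocab_size=20)
tweet = [1, 5, 3, 7, 2, 9, 4, 6]  # hypothetical token ids for one tweet
print(predict_bot_probability(model, tweet))
```

Because the weights are random, the printed probability carries no meaning until the parameters are trained on labelled tweets. The sketch only shows how the convolution extracts local n-gram-like features that the LSTM then reads in order.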

Keywords

CNN, LSTM, tweets, message detecting, deep fake

Conclusion

Deepfake text detection is a critical and challenging task in an era of misinformation and manipulated content. This project addressed the challenge by proposing an approach for deepfake text detection and evaluating its effectiveness. A dataset of tweets written by bots and humans was analysed using several machine learning and deep learning models together with feature engineering techniques. Two well-known feature extraction techniques, term frequency (TF) and term frequency-inverse document frequency (TF-IDF), and two word embedding techniques, FastText and FastText subword, were used. The proposed CNN-LSTM approach showed promising results, reaching an accuracy score of 0.96 in identifying deepfake text. The results of the proposed approach were also compared with state-of-the-art transfer learning models from previous literature. Overall, the CNN-LSTM model structure adopted in this project offers advantages in simplicity, computational efficiency, and handling of out-of-vocabulary terms. These advantages make the proposed approach a compelling option for text detection tasks and show that strong performance can be achieved without complex and time-consuming transfer learning models. The findings of this project contribute to advancing the field of deepfake detection and provide valuable insights for future research and practical applications. As social media continues to play a significant role in shaping public opinion, robust deepfake text detection techniques are essential to safeguard genuine information and preserve the integrity of democratic processes. In future research, quantum NLP and other cutting-edge methodologies will be applied to build more sophisticated and efficient detection systems that fight the spread of misinformation and deceptive content on social media platforms.
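The feature techniques named above can be illustrated with a short standard-library sketch. It computes TF and TF-IDF weights for a toy corpus and extracts boundary-marked character n-grams in the style of FastText subwords, which is why an unseen word can still share features with known ones. The function names, the toy tweets, and the smoothed IDF formula are assumptions made for illustration, not the project's actual preprocessing.

```python
import math
from collections import Counter

def term_frequencies(tokens):
    """Relative frequency of each term within one document (TF)."""
    counts = Counter(tokens)
    return {term: n / len(tokens) for term, n in counts.items()}

def tf_idf(corpus):
    """corpus: list of token lists. Returns one {term: weight} dict per document.
    Assumes the smoothed IDF log((1 + N) / (1 + df)) + 1, one common variant."""
    n_docs = len(corpus)
    doc_freq = Counter(term for doc in corpus for term in set(doc))
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}
    return [
        {t: tf * idf[t] for t, tf in term_frequencies(doc).items()}
        for doc in corpus
    ]

def char_ngrams(word, n_min=3, n_max=6):
    """Character n-grams of a word wrapped in '<' and '>' boundary markers."""
    wrapped = f"<{word}>"
    return [
        wrapped[i:i + n]
        for n in range(n_min, n_max + 1)
        for i in range(len(wrapped) - n + 1)
    ]

# Hypothetical two-tweet corpus.
corpus = [["bot", "tweet"], ["human", "tweet"]]
weights = tf_idf(corpus)
print(weights[0])  # "bot" outweighs "tweet", which appears in both documents

# An out-of-vocabulary word still overlaps with a known word at subword level.
shared = set(char_ngrams("tweet")) & set(char_ngrams("tweets"))
print(sorted(shared))
```

In a subword embedding model, a word vector is built from the vectors of its n-grams, so the overlap printed in the last step is what lets a misspelled or novel token receive a usable representation.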

© 2025 International Journal of Computer Science Engineering Techniques (IJCSE).