Research on SQL Injection Attacks using Word Embedding Techniques and Machine Learning

S. Venkatramulu; Md. Sharfuddin Waseem; Arshiya Taneem; Sri Yashaswini Thoutam; Snigdha Apuri; Nachiketh

doi:10.69996/jsihs.2024005

Authors

S. Venkatramulu, Associate Professor, Department of Computer Science and Engineering, KITSW (Affiliated to Kakatiya University), Warangal, Telangana 506015, India
Md. Sharfuddin Waseem, Assistant Professor, Department of Computer Science and Engineering, KITSW (Affiliated to Kakatiya University), Warangal, Telangana 506015, India
Arshiya Taneem, Student, Department of Computer Science and Engineering, KITSW (Affiliated to Kakatiya University), Warangal, India - 506015
Sri Yashaswini Thoutam, Student, Department of Computer Science and Engineering, KITSW (Affiliated to Kakatiya University), Warangal, India - 506015
Snigdha Apuri, Student, Department of Computer Science and Engineering, KITSW (Affiliated to Kakatiya University), Warangal, India - 506015
Nachiketh, Student, Department of Computer Science and Engineering, KITSW (Affiliated to Kakatiya University), Warangal, India - 506015

DOI:

https://doi.org/10.69996/jsihs.2024005

Keywords:

SQL injection, machine learning, word embedding techniques, SVM, logistic regression, XGBoost

Abstract

Most of the damage done by web application attacks comes from SQL injection attacks, in which an attacker can read, change, and remove data on database servers. All three tenets of security (confidentiality, integrity, and availability) are vulnerable to a SQL injection attack. Database management systems receive their queries in SQL (Structured Query Language). Although SQL injection is not a new field of study, detecting and preventing these attacks remains important. This paper proposes a method of SQL injection detection based on machine learning. After feature extraction, word embedding techniques such as the count vectorizer and the TF-IDF vectorizer are applied to the query text so that SQL injection (SQLI) features are represented effectively. Classification algorithms such as Logistic Regression and SVM, and ensemble techniques such as XGBoost, are then employed. The goal of this systematic review is to find a better machine learning model for detecting SQL injection attacks by applying different word embedding techniques. The accuracy and F1-score of each machine learning algorithm in predicting SQLI queries are calculated and reported.
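The feature and scoring steps named in the abstract can be illustrated with a minimal, standard-library-only Python sketch. It is not the authors' implementation. The toy queries, the tokenizer, and all function names are hypothetical, and the smoothed IDF formula, ln((1 + n) / (1 + df)) + 1 with L2 normalization, is one common variant assumed here for illustration. The classifier step (Logistic Regression, SVM, XGBoost) is left out; the sketch only shows how a query becomes a count or TF-IDF vector and how accuracy and F1-score are computed from predictions.

```python
import math
import re
from collections import Counter

# Hypothetical toy queries for illustration only; not data from the paper.
QUERIES = [
    "SELECT name FROM users WHERE id = 5",
    "SELECT name FROM users WHERE id = 5 OR 1 = 1",
    "SELECT price FROM items WHERE sku = 'A1'",
    "SELECT price FROM items WHERE sku = '' OR '1' = '1'",
]


def tokenize(query):
    """Lowercase a query and split it into words, numbers, and single symbols."""
    return re.findall(r"[a-z_]+|\d+|[^\sa-z_\d]", query.lower())


def build_vocabulary(docs):
    """Return the sorted list of distinct tokens across all tokenized queries."""
    return sorted({tok for doc in docs for tok in doc})


def count_vector(doc, vocab):
    """Count-vectorizer features: one raw token count per vocabulary entry."""
    counts = Counter(doc)
    return [counts[tok] for tok in vocab]


def tfidf_vectors(docs, vocab):
    """TF-IDF features with smoothed IDF (assumed variant) and L2 normalization."""
    n = len(docs)
    # Document frequency: in how many queries each token appears at least once.
    df = Counter(tok for doc in docs for tok in set(doc))
    idf = {tok: math.log((1 + n) / (1 + df[tok])) + 1.0 for tok in vocab}
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        raw = [counts[tok] * idf[tok] for tok in vocab]
        norm = math.sqrt(sum(v * v for v in raw)) or 1.0
        vectors.append([v / norm for v in raw])
    return vectors


def accuracy_and_f1(y_true, y_pred, positive=1):
    """Accuracy and F1-score, treating `positive` as the SQLI class label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return accuracy, f1


if __name__ == "__main__":
    docs = [tokenize(q) for q in QUERIES]
    vocab = build_vocabulary(docs)
    print(vocab)
    print(count_vector(docs[1], vocab))
    print(tfidf_vectors(docs, vocab)[1])
```

In this sketch the tautology `OR 1 = 1` shows up as extra counts for the `or`, `1`, and `=` tokens, which is the kind of signal a downstream classifier could learn from. TF-IDF additionally down-weights tokens such as `select` that occur in every query.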

