Research on SQL Injection Attacks using Word Embedding Techniques and Machine Learning

Authors

  • S. Venkatramulu, Associate Professor, Department of Computer Science and Engineering, KITSW (Affiliated to Kakatiya University), Warangal, Telangana 506015, India
  • Md. Sharfuddin Waseem, Assistant Professor, Department of Computer Science and Engineering, KITSW (Affiliated to Kakatiya University), Warangal, Telangana 506015, India
  • Arshiya Taneem, Student, Department of Computer Science and Engineering, KITSW (Affiliated to Kakatiya University), Warangal 506015, India
  • Sri Yashaswini Thoutam, Student, Department of Computer Science and Engineering, KITSW (Affiliated to Kakatiya University), Warangal 506015, India
  • Snigdha Apuri, Student, Department of Computer Science and Engineering, KITSW (Affiliated to Kakatiya University), Warangal 506015, India
  • Nachiketh, Student, Department of Computer Science and Engineering, KITSW (Affiliated to Kakatiya University), Warangal 506015, India

DOI:

https://doi.org/10.69996/jsihs.2024005

Keywords:

SQL injection, machine learning, word embedding techniques, SVM, logistic regression, XGBoost

Abstract

Most of the damage done by web application attacks comes from SQL injection, in which attackers can read, change, and remove data on database servers. All three tenets of security (confidentiality, integrity, and availability) are vulnerable to a SQL injection attack. Database management systems receive their queries as SQL (Structured Query Language) statements. Although this is not a new field of study, detecting and preventing SQL injection attacks remains important. This paper proposes a machine-learning-based method for SQL injection detection. After feature extraction, word embedding techniques such as the count vectorizer and the TF-IDF vectorizer are applied to the text data so that it effectively represents the SQLI features. Classification algorithms such as Logistic Regression and SVM, and ensemble techniques such as XGBoost, are then employed. The goal of this systematic review is to find a better machine learning model for detecting SQL injection attacks by implementing different word embedding techniques. The accuracy and F1-score of the machine learning algorithms in predicting SQLI queries are calculated and reported in this paper.
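
The pipeline described in the abstract (vectorize the query text, train a classifier, report accuracy and F1-score) can be illustrated with a minimal sketch. It is not the authors' implementation: the six queries, all function names, and the learning-rate and epoch settings are assumptions made for illustration, and only logistic regression is shown (SVM and XGBoost are omitted). It uses the Python standard library alone, with a smoothed IDF and L2 normalization as one common TF-IDF variant.

```python
import math
import re
from collections import Counter

# Toy, hand-written queries (hypothetical, not the paper's dataset). Label 1 = SQLI.
QUERIES = [
    ("SELECT name FROM users WHERE id = 5", 0),
    ("SELECT * FROM products WHERE price < 100", 0),
    ("UPDATE users SET email = 'a@b.com' WHERE id = 7", 0),
    ("SELECT * FROM users WHERE name = '' OR '1'='1' --", 1),
    ("SELECT * FROM users WHERE id = 1 UNION SELECT password FROM admin", 1),
    ("admin' OR 1=1 --", 1),
]

def tokenize(query):
    # Lowercased words, whole numbers, and single punctuation characters.
    return re.findall(r"[a-z_]+|\d+|[^\sa-z_\d]", query.lower())

def build_vocab(docs):
    # Map every distinct token to a column index.
    return {tok: i for i, tok in enumerate(sorted({t for d in docs for t in d}))}

def count_vector(doc, vocab):
    # Count-vectorizer features: raw token counts per query.
    vec = [0.0] * len(vocab)
    for tok, n in Counter(doc).items():
        if tok in vocab:
            vec[vocab[tok]] = float(n)
    return vec

def tfidf_vectors(docs, vocab):
    # TF-IDF features: counts scaled by a smoothed IDF, then L2-normalized.
    n_docs = len(docs)
    df = Counter(tok for d in docs for tok in set(d))
    idf = {tok: math.log((1 + n_docs) / (1 + df[tok])) + 1.0 for tok in vocab}
    rows = []
    for d in docs:
        vec = count_vector(d, vocab)
        for tok, i in vocab.items():
            vec[i] *= idf[tok]
        norm = math.sqrt(sum(v * v for v in vec)) or 1.0
        rows.append([v / norm for v in vec])
    return rows

def sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clip to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))

def train_logreg(X, y, lr=0.5, epochs=300):
    # Plain stochastic gradient descent on the logistic loss (assumed settings).
    w, bias = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            g = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + bias) - yi
            w = [wj - lr * g * xj for wj, xj in zip(w, xi)]
            bias -= lr * g
    return w, bias

def predict(X, w, bias):
    return [1 if sum(wj * xj for wj, xj in zip(w, xi)) + bias > 0 else 0 for xi in X]

def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def f1_score(y_true, y_pred):
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0

def evaluate(X, y):
    # Scores on the training data only, to show the plumbing;
    # a real study would score a held-out test split.
    w, bias = train_logreg(X, y)
    y_pred = predict(X, w, bias)
    return accuracy(y, y_pred), f1_score(y, y_pred)

docs = [tokenize(q) for q, _ in QUERIES]
labels = [label for _, label in QUERIES]
vocab = build_vocab(docs)
X_count = [count_vector(d, vocab) for d in docs]
X_tfidf = tfidf_vectors(docs, vocab)

for name, X in (("count", X_count), ("tf-idf", X_tfidf)):
    acc, f1 = evaluate(X, labels)
    print(f"{name}: train accuracy={acc:.2f}, F1={f1:.2f}")
```

Swapping `X_count` for `X_tfidf` changes only the feature representation, which is the kind of comparison the abstract describes. In practice a library such as scikit-learn or XGBoost would typically replace the hand-written vectorizers and classifier.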

Published

2024-03-31

How to Cite

S. Venkatramulu, Md. Sharfuddin Waseem, Arshiya Taneem, Sri Yashaswini Thoutam, Snigdha Apuri, & Nachiketh. (2024). Research on SQL Injection Attacks using Word Embedding Techniques and Machine Learning. Journal of Sensors, IoT & Health Sciences (JSIHS, ISSN: 2584-2560), 2(1), 55-64. https://doi.org/10.69996/jsihs.2024005