Research on SQL Injection Attacks using Word Embedding Techniques and Machine Learning
DOI:
https://doi.org/10.69996/jsihs.2024005
Keywords:
SQL injection, machine learning, word embedding techniques, SVM, logistic regression, XGBoost
Abstract
Most of the damage done by web application attacks comes from SQL injection attacks, in which an attacker can read, modify, and delete data on database servers. All three tenets of security (confidentiality, integrity, and availability) are vulnerable to a SQL injection attack. Database management systems receive their queries in SQL (Structured Query Language). Although this is not a new field of study, detecting and preventing SQL injection attacks remains important. This paper proposes a method of SQL injection detection based on machine learning. After feature extraction, word embedding techniques such as the count vectorizer and the TF-IDF vectorizer are applied to the text data so that it effectively represents SQL injection (SQLI) features. Classification algorithms such as Logistic Regression and SVM, together with ensemble techniques such as XGBoost, are then employed. The goal of this systematic review is to find a better machine learning model for detecting SQL injection attacks by implementing different word embedding techniques. The accuracy and F1-score of the machine learning algorithms in predicting SQLI queries are calculated and reported in this paper.
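The feature-extraction step named in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the toy queries, the tokenizer, the textbook TF-IDF formula (raw count times log(N/df)), and all function names below are assumptions made for illustration. In practice, libraries such as scikit-learn provide `CountVectorizer` and `TfidfVectorizer`, which use a smoothed variant of this weighting.

```python
import math
import re
from collections import Counter

# Toy, hand-written queries (illustrative only; not the paper's dataset).
QUERIES = [
    "SELECT name FROM users WHERE id = 5",
    "SELECT * FROM users WHERE id = 1 OR 1 = 1",
    "SELECT price FROM items WHERE id = 7",
    "SELECT * FROM items WHERE id = 1 UNION SELECT password FROM users",
]

def tokenize(query):
    """Lowercase the query and split it into words, numbers, and symbols."""
    return re.findall(r"[a-z_]+|\d+|[^\sa-z_\d]", query.lower())

def count_vectors(docs):
    """Bag-of-words counts: one row per query, one column per vocabulary term."""
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    rows = []
    for toks in tokenized:
        counts = Counter(toks)
        rows.append([counts[term] for term in vocab])
    return vocab, rows

def tfidf_vectors(docs):
    """Textbook TF-IDF (assumed variant): raw count * log(N / document frequency)."""
    vocab, rows = count_vectors(docs)
    n_docs = len(docs)
    df = [sum(1 for row in rows if row[j] > 0) for j in range(len(vocab))]
    idf = [math.log(n_docs / d) for d in df]
    return vocab, [[c * w for c, w in zip(row, idf)] for row in rows]

def accuracy_and_f1(y_true, y_pred):
    """Accuracy and F1-score for binary labels (1 = injection, 0 = benign)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return correct / len(y_true), f1

if __name__ == "__main__":
    vocab, tfidf = tfidf_vectors(QUERIES)
    # Terms that occur in every query (e.g. "select") get weight 0;
    # terms unique to one query (e.g. "or", "union") get the highest weight.
    for term in ("select", "or", "union"):
        print(term, [round(row[vocab.index(term)], 3) for row in tfidf])
```

The resulting rows could then be passed to any classifier; `accuracy_and_f1` shows how the two metrics reported in the abstract are computed from true and predicted labels.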
License
Copyright (c) 2024 Journal of Sensors, IoT & Health Sciences (JSIHS, ISSN: 2584-2560)
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
Fringe Global Scientific Press publishes all the papers under a Creative Commons Attribution-Non-Commercial 4.0 International (CC BY-NC 4.0) (https://creativecommons.org/licenses/by-nc/4.0/) license. Authors have the liberty to replicate and distribute their work. Authors have the ability to use either the whole or a portion of their piece in compilations or other publications that include their own work. Please see the licensing terms for more information on reusing the work.