Exploring the Power and Practical Applications of K-Nearest Neighbours (KNN) in Machine Learning
DOI:
https://doi.org/10.69996/jcai.2024002
Keywords:
Machine learning, k-nearest neighbours, artificial intelligence, KNN, cutting-edge field
Abstract
Machine learning, a core component of artificial intelligence, enables systems to learn on their own and improve performance through experience, doing away with the need for explicit programming. This cutting-edge field focuses on equipping computer programs with the ability to access vast datasets and derive intelligent decisions from them. One of the cornerstone algorithms in machine learning, the K-nearest neighbours (KNN) algorithm, is known for its simplicity and effectiveness. KNN stores all available data points in its training dataset and classifies new, unlabelled cases based on their similarity to the stored examples. This proximity-based classification approach makes KNN a versatile and intuitive tool with applications spanning diverse domains. This document explores the inner workings of the K-nearest neighbours algorithm, its practical applications across various domains, and a comprehensive examination of its strengths and limitations. Additionally, it offers insights into practical considerations and best practices for the effective implementation of KNN, illuminating its significance in the continually evolving landscape of machine learning and artificial intelligence.
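The classification procedure the abstract describes — store all training points, then label a new case by the majority vote of its nearest stored neighbours — can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: it assumes Euclidean distance and a simple majority vote, and the function name, data, and `k=3` choice are illustrative.

```python
from collections import Counter
import math

def knn_classify(train_points, train_labels, query, k=3):
    """Label `query` by majority vote among its k nearest training points."""
    # Compute the Euclidean distance from the query to every stored point,
    # then sort so the closest points come first.
    dists = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    # Keep only the labels of the k nearest neighbours.
    nearest_labels = [label for _, label in dists[:k]]
    # Return the most common label among them (the majority vote).
    return Counter(nearest_labels).most_common(1)[0][0]

# Illustrative 2-D data: class "a" clusters near the origin, class "b" near (5, 5).
points = [(0, 0), (1, 0), (0, 1), (5, 5), (6, 5), (5, 6)]
labels = ["a", "a", "a", "b", "b", "b"]

print(knn_classify(points, labels, (0.5, 0.5), k=3))  # query near the "a" cluster
```

Note that there is no training phase beyond storing the data: all the work happens at query time, which is why KNN is often called a "lazy" learner and why the choice of distance metric and of k are the main practical tuning decisions.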
License
Copyright (c) 2024 Journal of Computer Allied Intelligence(JCAI)
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
Fringe Global Scientific Press publishes all papers under a Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) license (https://creativecommons.org/licenses/by-nc/4.0/). Authors are free to reproduce and distribute their work, and may reuse all or part of it in compilations or other publications that include their own work. Please see the licensing terms for more information on reusing the work.