K-Nearest Neighbors (KNN) is a type of instance-based learning, or lazy learning, where the function is approximated locally and all computation is deferred until function evaluation. KNN has been used in statistical estimation and pattern recognition as a non-parametric technique.
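To make the "approximated locally" idea concrete, here is a minimal from-scratch sketch of classifying a single point by majority vote among its k nearest neighbors. The helper name `knn_one` and its arguments are hypothetical and only for illustration; it assumes Euclidean distance and is not how the 'class' package implements KNN internally.

```r
# Hypothetical helper: classify one query point by majority vote
# among its k nearest training rows (Euclidean distance assumed).
knn_one <- function(train_x, train_y, query, k = 3) {
  # Subtract the query from every training row, column by column
  diffs <- sweep(as.matrix(train_x), 2, as.numeric(unlist(query)))
  # Euclidean distance from the query to each training row
  dists <- sqrt(rowSums(diffs^2))
  # Labels of the k closest training rows
  nearest <- train_y[order(dists)[seq_len(k)]]
  # Majority vote; on a tie, the first most frequent label wins
  names(which.max(table(nearest)))
}

data(iris)
# Classify the first flower using all other flowers as training data
result <- knn_one(iris[-1, 1:4], iris$Species[-1], iris[1, 1:4], k = 3)
print(result)
```

Note that nothing is "trained" here: all the work happens when a prediction is requested, which is what makes the method lazy.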
Today, we'll look into how to implement KNN in R for classification, using the iris dataset.
First, we'll need to set up our environment. Make sure you have R and the 'class' package installed. If you haven't already installed the 'class' package, you can do so with the following command:
install.packages('class')
Now, we are ready to load the required package and the iris dataset.
library(class)
data(iris)
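Before training, let's randomly split the data into a training set (roughly 67% of the rows) and a test set (roughly 33%), keeping only the four numeric feature columns. Note that this step does not normalize the features; the raw measurements are used as they are.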
set.seed(123)
ind <- sample(2, nrow(iris), replace = TRUE, prob = c(0.67, 0.33))
iris.training <- iris[ind==1, 1:4]
iris.test <- iris[ind==2, 1:4]
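The split above works on the raw measurements. Because KNN is distance-based, features with larger ranges can dominate the distance, so normalizing is a common preprocessing step. Here is a minimal sketch of min-max normalization, assuming a hypothetical helper `normalize` and a hypothetical `iris_norm` data frame (neither is part of the 'class' package). For simplicity it rescales the whole dataset at once, which lets the test rows influence the scaling.

```r
# Hypothetical helper: rescale a numeric vector to the [0, 1] range
normalize <- function(x) (x - min(x)) / (max(x) - min(x))

data(iris)
# Apply the rescaling to each of the four numeric feature columns
iris_norm <- as.data.frame(lapply(iris[1:4], normalize))
summary(iris_norm)
```

If you want to use the normalized features, you could index `iris_norm` instead of `iris` when building the training and test sets.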
We can now implement KNN. The following block of code classifies the test data. We're setting K to 3, but feel free to try different values of K to see how they affect the result.
iris_pred <- knn(iris.training, iris.test, iris[ind==1,5], k=3)
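To see how the choice of K affects the result, one option is to loop over a few values and record the test accuracy for each. This is a sketch under the same split as above; the names `train_x`, `test_x`, `train_y`, `test_y`, `ks`, and `acc` are introduced here for illustration only.

```r
library(class)
data(iris)

# Same random split as in the walkthrough
set.seed(123)
ind <- sample(2, nrow(iris), replace = TRUE, prob = c(0.67, 0.33))
train_x <- iris[ind == 1, 1:4]
test_x  <- iris[ind == 2, 1:4]
train_y <- iris[ind == 1, 5]
test_y  <- iris[ind == 2, 5]

# Test-set accuracy for several candidate values of k
ks <- c(1, 3, 5, 7, 9)
acc <- sapply(ks, function(k) {
  pred <- knn(train_x, test_x, cl = train_y, k = k)
  mean(pred == test_y)
})
data.frame(k = ks, accuracy = acc)
```

Odd values of K are a common choice because they reduce the chance of tied votes, although ties can still occur with three classes.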
Finally, we'll evaluate how well our model performs. We compare the predicted results with the actual category of the test data.
iris_actual <- iris[ind==2,5]
mean(iris_pred==iris_actual)
The result is the accuracy of our KNN model: the fraction of test observations whose predicted species matches the actual one.
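A single accuracy number hides which species get confused with each other. A confusion matrix built with base R's `table()` shows this breakdown. The sketch below repeats the earlier steps so it runs on its own; `conf_mat` is a name introduced here for illustration.

```r
library(class)
data(iris)

# Same split and model as in the walkthrough
set.seed(123)
ind <- sample(2, nrow(iris), replace = TRUE, prob = c(0.67, 0.33))
iris_pred <- knn(iris[ind == 1, 1:4], iris[ind == 2, 1:4], iris[ind == 1, 5], k = 3)
iris_actual <- iris[ind == 2, 5]

# Rows are predicted species, columns are actual species
conf_mat <- table(Predicted = iris_pred, Actual = iris_actual)
conf_mat

# Correct predictions sit on the diagonal, so this matches the accuracy
sum(diag(conf_mat)) / sum(conf_mat)
```

Off-diagonal counts are the misclassifications, which shows where the model struggles.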
That's it! We've built and evaluated a K-Nearest Neighbors model for classification in R using the iris dataset. KNN is a simple yet effective machine learning algorithm that classifies each instance according to the labels of the training instances closest to it.