paulocesarcsdev/machine_learning_from_scratch_python

Classification

  • KNN Algorithm - Finding Nearest Neighbors

    The k-nearest neighbors (KNN) algorithm is a supervised ML algorithm that can be used for both classification and regression problems, although in practice it is mainly used for classification. The following two properties characterize KNN well:

    Lazy learning algorithm − KNN is a lazy learning algorithm because it has no specialized training phase; it simply stores all of the training data and uses it at classification time.

    Non-parametric learning algorithm − KNN is also a non-parametric learning algorithm because it makes no assumptions about the underlying data distribution.

    Applications of KNN

    The following are some of the areas in which KNN can be applied successfully

    Banking System − KNN can be used in a banking system to predict whether an individual is a good candidate for loan approval, i.e. whether that individual has characteristics similar to those of past defaulters.

    Calculating Credit Ratings − KNN algorithms can be used to estimate an individual’s credit rating by comparing the individual with people who have similar traits.

    Politics − With the help of KNN algorithms, we can classify a potential voter into classes such as “Will Vote”, “Will Not Vote”, “Will Vote for Party ‘Congress’” and “Will Vote for Party ‘BJP’”.

    Other areas in which the KNN algorithm can be applied include Speech Recognition, Handwriting Recognition, Image Recognition and Video Recognition. A minimal from-scratch sketch of the classifier is shown below.
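Since this repository implements the algorithm from scratch in Python, here is a minimal sketch of nearest-neighbor classification with Euclidean distance and majority voting. The function names and sample data are illustrative assumptions, not this repository's actual code:

```python
# Minimal from-scratch KNN classifier sketch (names and data are illustrative).
import math
from collections import Counter

def euclidean_distance(a, b):
    """Straight-line distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_predict(X_train, y_train, x_query, k=3):
    """Classify x_query by majority vote among its k nearest training points."""
    # Lazy learning: no training phase, just store the data and scan it at query time.
    distances = [(euclidean_distance(x, x_query), label)
                 for x, label in zip(X_train, y_train)]
    k_nearest = sorted(distances, key=lambda d: d[0])[:k]
    votes = Counter(label for _, label in k_nearest)
    return votes.most_common(1)[0][0]

# Tiny usage example with made-up points in two classes.
X_train = [[1.0, 1.1], [1.2, 0.9], [6.0, 6.2], [5.8, 6.1]]
y_train = ["A", "A", "B", "B"]
print(knn_predict(X_train, y_train, [1.1, 1.0], k=3))  # -> "A"
```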

Pros and Cons of KNN

Pros

  • It is a very simple algorithm to understand and interpret.

  • It is very useful for nonlinear data because it makes no assumptions about the data.

  • It is a versatile algorithm that can be used for classification as well as regression.

  • It has relatively high accuracy, although better-performing supervised learning models exist.

Cons

  • It is a computationally expensive algorithm because it stores all of the training data.

  • It requires more memory than other supervised learning algorithms.

  • Prediction is slow when the number of training samples is large.

  • It is very sensitive to the scale of the data and to irrelevant features; a feature-scaling sketch is shown after this list.
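Because KNN relies on raw distances, features with larger numeric ranges dominate the vote. A common mitigation is to rescale each feature to [0, 1] before computing distances; the min-max normalization below is one assumed approach, not necessarily what this repository does:

```python
# Min-max scaling sketch: rescale each feature column to [0, 1] so no single
# feature dominates the Euclidean distance. Example values are made up.
def min_max_scale(X):
    n_features = len(X[0])
    mins = [min(row[j] for row in X) for j in range(n_features)]
    maxs = [max(row[j] for row in X) for j in range(n_features)]
    return [[(row[j] - mins[j]) / (maxs[j] - mins[j]) if maxs[j] > mins[j] else 0.0
             for j in range(n_features)]
            for row in X]

# Example: income (in thousands) vs. age; without scaling, income dominates.
X = [[45.0, 23.0], [120.0, 54.0], [60.0, 31.0]]
print(min_max_scale(X))
```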

Euclidean distance

The distance measure used is the Euclidean distance between two points $p$ and $q$ with $n$ coordinates:

$d(p, q) = \sqrt{\sum_{i=1}^{n} (p_i - q_i)^2}$
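As a quick check of the formula, the distance can be computed directly; NumPy is used here only for brevity, and a pure-Python sum works just as well:

```python
import numpy as np

p = np.array([1.0, 2.0, 3.0])
q = np.array([4.0, 6.0, 3.0])
# Square root of the sum of squared per-coordinate differences.
d = np.sqrt(np.sum((p - q) ** 2))
print(d)  # 5.0
```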
