Overview of K-nearest Neighbor Algorithm

Published: 2021-09-15 16:15:08
essay essay

Category: Computer Science, Technology

Type of paper: Essay

This essay has been submitted by a student. This is not an example of the work written by our professional essay writers.

Hey! We can write a custom essay for you.

All possible types of assignments. Written by academics

In pattern recognition, the K-nearest neighbor algorithm (K-NN) is a non-parametric method used for classification and regression. In both cases, the input consists of the k closest training examples in the feature space.
The output depends on whether K-NN used for classification or regression:
In K-NN classification: The output is a class membership. The object is classified by majority vote of its neighbors. With the object being assigned to the class most common among its K nearest neighbors, (K is a positive integer, typically small). If K= 1, then the object is simply assigned to the class of that single nearest neighbor. In K-NN regression: the output is the property value for the object. This value is the average of the values of its K nearest neighbors.
K nearest neighbors is a simple algorithm that stores all available cases and classifies new cases based on a similarity measure .K-NN has been used in statistical estimation and pattern recognition as a non-parametric technique.
K-NN is a type of machine learning technique or you call it (lazy machine) to use algorithm , where the function is only approximated locally and all computation is deferred until classification. The K-NN algorithm is among the simplest of all learning algorithms. The neighbors are taken from a set of objects for which the class (for K-NN classification) or the object property value (for K-NN regression) is known. This can be thought of as the training set for the algorithm, though no explicit training step is required.
In the classification phase, K is a user-defined constant, and an unlabeled vector (a query or test point) is classified by assigning the label which is most frequent among the K training samples nearest to that query point.
A commonly used distance metric for continuous variables is Euclidean distance. For discrete variables, such as for text classification, another metric can be used, such as the overlap metric (or Hamming distance):
The best choice of K depends upon the data; generally, larger values of K reduce the effect of noise on the classification, but make boundaries between classes less distinct. A good K can be selected by various heuristic techniques. The special case where the class is predicted to be the class of the closest training sample when (K = 1) and is called the nearest neighbor algorithm. In K-NN regression, the K-NN algorithm is used for estimating continuous variables.

Warning! This essay is not original. Get 100% unique essay within 45 seconds!


We can write your paper just for 11.99$

i want to copy...

This essay has been submitted by a student and contain not unique content

People also read