BK-Tree is a data structure used for spell checking based on the Levenshtein Distance between two words, which is basically the number of changes you need to make to a word to turn it into another word. 

BK-Trees, or Burkhard-Keller Trees are a tree-based data structure engineered for quickly finding near-matches to a string, for example, as used by a spelling checker, or when doing a 'fuzzy' search for a term. The aim is to return, for example, "seek" and "peek" if I search for "aeek". What makes BK-Trees so cool is that they take a problem which has no obvious solution besides brute-force search, and present a simple and elegant method for speeding up searches substantially. 

A BK-tree is a metric tree, adapted to discrete metric spaces. An arbitrary element a is selected as root node. The root node may have zero or more subtrees. The k-th subtree is recursively built of all elements b such that {\displaystyle d(a,b)=k} d(a,b)=k
 

Distances Example :

  •     LevenshteinDistance(cook, book) -> 1
  •     LevenshteinDistance(cook, books) -> 2
  •     LevenshteinDistance(what, water) -> 3

Usage:

  • BK-trees can be used for approximate string matching in a dictionary
  • Levenshtein distance – the distance metric commonly used when building a BK-tree

 

It forms a Metric Space. Put simply, a metric space is any relationship that adheres to three basic criteria:

  • d(x,y) = 0 <-> x = y (If the distance between x and y is 0, then x = y)

  • d(x,y) = d(y,x) (The distance from x to y is the same as the distance from y to x)

  • d(x,y) + d(y,z) >= d(x,z)