What Is Fuzzy Search and the Logic Behind It?

Akshaya Balasubramaniyan

Content Lead, Keyspider

January 2024

7 min read

Across the internet, where users face an abundance of information, the key to standing out is an excellent user experience. Imagine a website that not only understands but anticipates user intent, making each visit seamless and enjoyable. This is where fuzzy search comes into play. Users expect search to forgive their mistakes, and a single 'no results found' page caused by a typo can send them straight to a competitor.

Fuzzy search is a technique that finds results that match a query approximately rather than exactly. It handles typos and spelling variations, ranking results by how closely they match the intent of the query rather than demanding a character-perfect match. Understanding how this works, and why it matters, is essential for anyone responsible for website search quality.

Understanding Fuzzy Logic

Fuzzy logic may sound complex, but it is surprisingly intuitive when broken down. Traditional binary logic works in absolutes: something either matches or it does not. Fuzzy logic instead works with degrees of truth, expressed on a scale from 0 to 1. A search result might be 0.9 relevant, or 0.4 relevant, rather than simply relevant or not relevant.

Consider a simple example. A user searches for 'sunn' instead of 'sunny'. Binary logic returns nothing, because 'sunn' does not exactly match any word in the index. Fuzzy logic recognises that 'sunn' is very close to 'sunny', scoring it at perhaps 0.85 relevance, and returns results about sunshine, sunny days, and related content. The user gets what they need despite the incomplete query.
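The idea of a score between 0 and 1 can be sketched in a few lines of Python. This is only an illustration, not how any particular search engine scores results: it uses the standard library's difflib.SequenceMatcher as a stand-in similarity measure, and the index_terms list is a made-up sample index. The exact numbers depend on the scoring formula chosen, so they will not necessarily match the 0.85 used in the example above.

```python
from difflib import SequenceMatcher


def similarity(query: str, term: str) -> float:
    """Return a score from 0.0 (nothing in common) to 1.0 (identical)."""
    return SequenceMatcher(None, query.lower(), term.lower()).ratio()


# Hypothetical index contents, chosen only for illustration.
index_terms = ["sunny", "sunshine", "funny", "rainy"]
query = "sunn"

# Score every term, then list the closest matches first.
scored = sorted(((similarity(query, term), term) for term in index_terms), reverse=True)
for score, term in scored:
    print(f"{term}: {score:.2f}")
```

A binary matcher would reject every term in this list, while the graded score puts 'sunny' clearly at the top and 'rainy' near the bottom.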

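This principle extends to more complex matching: transposed letters ('hte' for 'the', 'recieve' for 'receive'), substituted characters ('seperate' for 'separate'), missing characters ('univrsity' for 'university'), and extra characters ('productss' for 'products'). Fuzzy search handles all of these through algorithms that calculate the minimum number of single-character edits needed to transform one string into another, a measure known as edit distance. The best-known variant is Levenshtein distance, which counts insertions, deletions, and substitutions. The Damerau-Levenshtein variant also counts a swap of two adjacent letters as a single edit, where plain Levenshtein would count it as two.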
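As a rough sketch of how one measure can cover all four kinds of typo, the function below implements the optimal string alignment distance, a restricted form of Damerau-Levenshtein distance in which an adjacent swap costs one edit. The function name and the 'seperate' example are our own choices for illustration; real engines may use a different variant.

```python
def osa_distance(a: str, b: str) -> int:
    """Optimal string alignment distance.

    Insertions, deletions, substitutions and swaps of two adjacent
    characters each cost one edit.
    """
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete every character of a[:i]
    for j in range(cols):
        d[0][j] = j  # insert every character of b[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution (or match)
            )
            # Adjacent transposition, e.g. "ht" <-> "th".
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


examples = [
    ("hte", "the"),               # transposed letters
    ("seperate", "separate"),     # substituted character
    ("univrsity", "university"),  # missing character
    ("productss", "products"),    # extra character
]
for typo, intended in examples:
    print(typo, "->", intended, osa_distance(typo, intended))
```

Each of the four pairs comes out at a distance of one edit under this measure.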

How Fuzzy Search Works

When a user enters a query, the fuzzy search algorithm calculates the edit distance between the query and every candidate in the index. Results are scored by how many edits would be required to turn the query into the indexed term. A result requiring zero edits is an exact match; a result requiring one edit (for example, adding a missing letter) scores very highly; results requiring more edits score progressively lower.
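The scoring step described above might look something like the following sketch. It compares the query with every term in a small in-memory list, which is an assumption made for clarity; the term list and function names are hypothetical, and a production index would normally avoid a full scan.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance using insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute, free if equal
            ))
        prev = curr
    return prev[-1]


def rank_by_edit_distance(query: str, terms: list[str]) -> list[tuple[int, str]]:
    """Return (distance, term) pairs, fewest edits first."""
    return sorted((levenshtein(query, term), term) for term in terms)


# Hypothetical indexed terms.
index_terms = ["universe", "diversity", "university", "universal"]
ranked = rank_by_edit_distance("univrsity", index_terms)
for distance, term in ranked:
    print(distance, term)
```

Here 'university' needs a single inserted letter, so it ranks first, and terms needing more edits follow in order.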

Most production fuzzy search implementations set a maximum edit distance threshold that grows with query length, commonly one edit for short queries and two for longer ones. Results beyond this threshold are excluded entirely. Many systems also account for the position of errors: an error at the beginning of a word is penalised more heavily than one at the end, on the assumption that users are more likely to mistype the end of a long word than its beginning.
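A length-based threshold can be sketched as follows. The specific cut-offs in max_edits and the prefix_length parameter are illustrative assumptions rather than values taken from any particular product; requiring the first characters to match exactly is one simple way of treating errors at the start of a word more strictly than errors at the end.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance using insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = curr
    return prev[-1]


def max_edits(query: str) -> int:
    """Illustrative thresholds: longer queries tolerate more edits."""
    if len(query) <= 2:
        return 0
    if len(query) <= 5:
        return 1
    return 2


def fuzzy_filter(query: str, terms: list[str], prefix_length: int = 1) -> list[tuple[int, str]]:
    """Keep terms within the threshold; the first characters must match exactly."""
    limit = max_edits(query)
    hits = []
    for term in terms:
        if term[:prefix_length] != query[:prefix_length]:
            continue  # an error at the start of the word is not tolerated
        distance = levenshtein(query, term)
        if distance <= limit:
            hits.append((distance, term))
    return sorted(hits)


# Hypothetical indexed terms.
print(fuzzy_filter("prodcts", ["products", "product", "projects", "conducts"]))
```

Terms beyond the limit never reach the ranking stage, which keeps loosely related results out of the list altogether.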

Levenshtein distance in plain terms

Levenshtein distance measures the minimum number of insertions, deletions, or substitutions needed to transform one string into another. 'kitten' to 'sitting' requires three operations: substitute 'k' with 's', substitute 'e' with 'i', and insert 'g' at the end. The Levenshtein distance is 3. Fuzzy search uses this calculation to determine how closely a query matches indexed content.
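For readers who want to see the calculation itself, here is a minimal sketch of the classic dynamic-programming approach. It fills a table in which each cell holds the cheapest way to turn a prefix of the first word into a prefix of the second. The function and variable names are our own.

```python
def levenshtein(source: str, target: str) -> int:
    """Minimum number of insertions, deletions and substitutions."""
    rows, cols = len(source) + 1, len(target) + 1
    # table[i][j] = edits needed to turn source[:i] into target[:j]
    table = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        table[i][0] = i  # delete all i characters
    for j in range(cols):
        table[0][j] = j  # insert all j characters
    for i in range(1, rows):
        for j in range(1, cols):
            substitution_cost = 0 if source[i - 1] == target[j - 1] else 1
            table[i][j] = min(
                table[i - 1][j] + 1,                      # deletion
                table[i][j - 1] + 1,                      # insertion
                table[i - 1][j - 1] + substitution_cost,  # substitution or match
            )
    return table[-1][-1]


print(levenshtein("kitten", "sitting"))  # 3
```

The bottom-right cell of the table is the answer, which for 'kitten' and 'sitting' agrees with the three operations listed above.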

Enhancing Website User Experience with Fuzzy Search

The most direct benefit of fuzzy search is the elimination of unnecessary zero-results pages. Every zero-results page represents a user who came looking for something and left without finding it. On many websites, a significant proportion of zero-results pages are caused by nothing more serious than a typo. Fuzzy search turns those failures into successful searches.

Fuzzy search also reduces the cognitive burden on users. Knowing that a search engine is forgiving encourages users to search freely rather than carefully checking their spelling before pressing enter. This increased engagement with search is generally positive: users who use site search are commonly reported to convert at higher rates than users who only browse.

Handles common typos and misspellings without requiring users to self-correct
Supports users on mobile devices where typing accuracy is lower
Accommodates non-native speakers and users with dyslexia or reading difficulties
Reduces 'no results' rates, which are a leading cause of website abandonment
Works alongside synonym expansion and semantic search to create a multi-layered tolerance for query variation

Practical Applications for Websites

For e-commerce websites, fuzzy search directly protects revenue. A user searching for 'airpods' who accidentally types 'airlpods' should still find the relevant product page. The alternative, a zero-results page, likely results in the user leaving to find the product on a competitor's site that does handle the typo gracefully.

For government and public sector websites, fuzzy search combined with semantic search is particularly powerful. Citizens searching for services often struggle with official terminology, and they also make spelling errors. A fuzzy semantic search engine handles both the vocabulary gap (using semantic matching) and the precision gap (using fuzzy matching), maximising the chance that every search attempt leads to a useful result.

For internal knowledge bases and document libraries, fuzzy search helps employees find documents even when they only partially remember a title or cannot recall the exact spelling of a technical term. In domains with extensive jargon, where correct spelling is difficult even for experts, this is a meaningful productivity benefit.

Fuzzy Search and Semantic Search: Complementary Technologies

It is important to understand that fuzzy search and semantic search solve different problems. Fuzzy search handles the precision problem: the user knows what they want but has not typed it correctly. Semantic search handles the vocabulary problem: the user has typed their query correctly, but their words do not match the words used in the documents they are looking for.

Best-in-class search implementations combine both techniques. A query like 'how to cancle my acont' benefits from fuzzy matching (correcting the typos in 'cancle' and 'acont') and then from semantic matching (understanding that 'cancel my account' maps to content about account termination or subscription management, even if those exact words appear nowhere in the user's query).
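The two stages can be pictured with a toy pipeline. Everything here is a simplifying assumption: VOCABULARY and INTENT_TOPICS are hypothetical, the fuzzy step uses difflib.get_close_matches from the Python standard library, and the 'semantic' step is only a lookup table standing in for the learned models that real semantic search typically relies on.

```python
from difflib import get_close_matches

# Hypothetical vocabulary of correctly spelled indexed terms.
VOCABULARY = ["how", "to", "cancel", "my", "account", "close", "subscription"]

# Toy stand-in for semantic matching: word sets mapped to content topics.
INTENT_TOPICS = {
    frozenset({"cancel", "account"}): ["account termination", "subscription management"],
}


def correct_typos(query: str) -> list[str]:
    """Fuzzy stage: replace each word with its closest vocabulary term, if any."""
    corrected = []
    for word in query.lower().split():
        matches = get_close_matches(word, VOCABULARY, n=1, cutoff=0.8)
        corrected.append(matches[0] if matches else word)
    return corrected


def topics_for(words: list[str]) -> list[str]:
    """Semantic stage (toy): map the corrected words to related topics."""
    found = []
    for keywords, topics in INTENT_TOPICS.items():
        if keywords <= set(words):
            found.extend(topics)
    return found


words = correct_typos("how to cancle my acont")
print(words)
print(topics_for(words))
```

The order matters in this sketch: the topic lookup only succeeds because the fuzzy stage has already turned 'cancle' and 'acont' into words the second stage recognises.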

Configuring Fuzzy Search Correctly

Fuzzy search requires careful calibration. Too permissive a threshold (allowing many edits) will return loosely related results that confuse rather than help users. Too restrictive a threshold defeats the purpose. Common best practice is to apply fuzzy matching only after an exact match attempt fails, and to weight exact matches heavily above fuzzy matches in ranking.
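One way to picture the exact-first approach is the sketch below. The EXACT_BOOST value, the min_score cut-off, and the catalog list are all hypothetical, and difflib.SequenceMatcher again stands in for whatever similarity measure a real engine would use.

```python
from difflib import SequenceMatcher

EXACT_BOOST = 1.0  # hypothetical bonus that keeps exact matches above any fuzzy score


def search(query: str, terms: list[str], min_score: float = 0.75) -> list[tuple[float, str]]:
    """Try an exact match first and fall back to fuzzy matching only if it fails."""
    q = query.lower()
    exact = [term for term in terms if term.lower() == q]
    if exact:
        return [(1.0 + EXACT_BOOST, term) for term in exact]

    # Fuzzy fallback: keep only terms similar enough to be useful.
    scored = [(SequenceMatcher(None, q, term.lower()).ratio(), term) for term in terms]
    return sorted((pair for pair in scored if pair[0] >= min_score), reverse=True)


# Hypothetical product names.
catalog = ["airpods", "earbuds", "ipad", "charger"]
print(search("airpods", catalog))   # exact match, fuzzy step skipped
print(search("airlpods", catalog))  # typo, handled by the fuzzy fallback
```

Raising min_score makes the fallback stricter, and lowering it lets in looser matches, which is the calibration trade-off described above.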

Additionally, short queries should use tighter fuzzy thresholds than long queries. A one-edit tolerance for a three-letter query like 'cat' would also match 'bat', 'cot', 'car', 'cart', and dozens of other common short words. The same one-edit tolerance applied to a fifteen-character query is far more conservative, because few real words lie within one edit of a long term.
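The effect of query length can be seen by generating every string that is one edit away from a query and checking which of them are real words. The WORDS set below is a tiny hand-picked sample, so the counts are only indicative; a full dictionary would show the same pattern on a larger scale.

```python
import string


def one_edit_neighbours(word: str) -> set[str]:
    """All strings reachable from word by one deletion, substitution or insertion."""
    letters = string.ascii_lowercase
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = {left + right[1:] for left, right in splits if right}
    replaces = {left + c + right[1:] for left, right in splits if right for c in letters}
    inserts = {left + c + right for left, right in splits for c in letters}
    return (deletes | replaces | inserts) - {word}


# Hand-picked sample vocabulary (an assumption, not a real dictionary).
WORDS = {
    "cat", "bat", "hat", "mat", "rat", "sat", "cot", "cut", "car", "can",
    "cap", "cab", "at", "cart", "cast", "chat", "coat", "cats",
    "international", "internationally", "intentionally",
}


def one_edit_hits(query: str) -> list[str]:
    """Real words within one edit of the query."""
    return sorted(one_edit_neighbours(query) & WORDS)


print(len(one_edit_hits("cat")), one_edit_hits("cat"))
print(len(one_edit_hits("internationally")), one_edit_hits("internationally"))
```

In this sample the three-letter query collides with many unrelated words, while the fifteen-letter query has no one-edit neighbours in the list at all.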

Implementation note

Modern search platforms like Keyspider handle fuzzy matching configuration automatically, applying appropriate edit distance thresholds based on query length and using machine learning to improve fuzzy matching based on actual user behaviour. This removes the manual calibration burden that earlier fuzzy search implementations required.

Explore further

12 Must-Have Site Search Features: The complete feature set for modern, high-performing site search.

AI Search vs. Keyword Search: How semantic search closes the vocabulary gap that fuzzy search cannot address.

Why Your Website Site Search Should Be Semantic: The case for moving beyond keyword and fuzzy search to full semantic retrieval.

Keyspider AI Search Product Overview: See how Keyspider combines fuzzy matching, synonym expansion, and semantic search.