Introduction

String similarity algorithms measure how similar two strings are. They’re essential for fuzzy matching, spell checking, search suggestions, and data deduplication.

Levenshtein Distance

Edit distance - minimum edits needed to transform one string into another.

Operations: Insert, delete, substitute

Example:

"kitten" → "sitting"
- k → s (substitute)
- i → i (same)
- t → t (same)
- t → t (same)
- e → i (substitute)
- n → n (same)
+ g (insert)

Distance: 3

Implementation

function levenshteinDistance(str1, str2) {
  const m = str1.length;
  const n = str2.length;
  const dp = Array(m + 1)
    .fill(null)
    .map(() => Array(n + 1).fill(0));

  for (let i = 0; i <= m; i++) dp[i][0] = i;
  for (let j = 0; j <= n; j++) dp[0][j] = j;

  for (let i = 1; i <= m; i++) {
    for (let j = 1; j <= n; j++) {
      if (str1[i - 1] === str2[j - 1]) {
        dp[i][j] = dp[i - 1][j - 1];
      } else {
        dp[i][j] =
          1 +
          Math.min(
            dp[i - 1][j], // deletion
            dp[i][j - 1], // insertion
            dp[i - 1][j - 1] // substitution
          );
      }
    }
  }

  return dp[m][n];
}

Similarity Metrics

Normalized Similarity

function similarity(str1, str2) {
  const maxLen = Math.max(str1.length, str2.length);
  if (maxLen === 0) return 1.0;
  const distance = levenshteinDistance(str1, str2);
  return 1 - distance / maxLen;
}

similarity("kitten", "sitting"); // ~0.57 (57% similar)

Tools

Use our tools:

Conclusion

String similarity algorithms enable:

Use cases:

  • Fuzzy search
  • Spell checking
  • Data deduplication
  • Autocomplete
  • Record matching

Common algorithms:

  • Levenshtein distance
  • Jaro-Winkler
  • Hamming distance
  • Cosine similarity

Next Steps