Introduction
String similarity algorithms measure how similar two strings are. They’re essential for fuzzy matching, spell checking, search suggestions, and data deduplication.
Levenshtein Distance
Edit distance - minimum edits needed to transform one string into another.
Operations: Insert, delete, substitute
Example:
"kitten" → "sitting"
- k → s (substitute)
- i → i (same)
- t → t (same)
- t → t (same)
- e → i (substitute)
- n → n (same)
+ g (insert)
Distance: 3
Implementation
function levenshteinDistance(str1, str2) {
const m = str1.length;
const n = str2.length;
const dp = Array(m + 1)
.fill(null)
.map(() => Array(n + 1).fill(0));
for (let i = 0; i <= m; i++) dp[i][0] = i;
for (let j = 0; j <= n; j++) dp[0][j] = j;
for (let i = 1; i <= m; i++) {
for (let j = 1; j <= n; j++) {
if (str1[i - 1] === str2[j - 1]) {
dp[i][j] = dp[i - 1][j - 1];
} else {
dp[i][j] =
1 +
Math.min(
dp[i - 1][j], // deletion
dp[i][j - 1], // insertion
dp[i - 1][j - 1] // substitution
);
}
}
}
return dp[m][n];
}
Similarity Metrics
Normalized Similarity
function similarity(str1, str2) {
const maxLen = Math.max(str1.length, str2.length);
if (maxLen === 0) return 1.0;
const distance = levenshteinDistance(str1, str2);
return 1 - distance / maxLen;
}
similarity("kitten", "sitting"); // ~0.57 (57% similar)
Tools
Use our tools:
Conclusion
String similarity algorithms enable:
Use cases:
- Fuzzy search
- Spell checking
- Data deduplication
- Autocomplete
- Record matching
Common algorithms:
- Levenshtein distance
- Jaro-Winkler
- Hamming distance
- Cosine similarity
Next Steps
- Calculate with String Similarity
- Try Levenshtein Distance
- Learn Algorithms