Erlang Cookbook

Testing Similarity of Two Strings via Levenshtein Distance

Problem

You want to determine how "similar" two strings (or lists, or vectors) are, such as for spelling correction or pattern-matching.

Solution

Note: The Erlang library does not exist yet. Scheme code is shown for the time being:

One metric of string similarity is the Levenshtein Distance, or edit distance: the minimum number of character insertions, deletions, and substitutions needed to transform one string into another. To get the Levenshtein Distance of two strings, use the levenshtein.scm library (available from http://www.neilvandyke.org/levenshtein-scm/) and load it as follows:

(require (lib "levenshtein.ss" "levenshtein"))

Then you can use the string-levenshtein procedure:

> (string-levenshtein "adresse" "address")
2
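
For readers who want to see what such a procedure computes, the following is a minimal sketch of the classic dynamic-programming (Wagner-Fischer) approach. It is written in Python purely for illustration, since no Erlang version exists yet. The function name `levenshtein` and its signature are assumptions made for this example; this is not the levenshtein.scm implementation or its API.

```python
def levenshtein(source: str, target: str) -> int:
    """Illustrative edit distance using a full dynamic-programming table.

    Hypothetical helper for this recipe; not part of levenshtein.scm.
    """
    rows, cols = len(source) + 1, len(target) + 1
    # dist[i][j] holds the number of edits needed to turn
    # the first i characters of source into the first j of target.
    dist = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        dist[i][0] = i  # delete all i characters
    for j in range(cols):
        dist[0][j] = j  # insert all j characters
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if source[i - 1] == target[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j] + 1,         # deletion
                dist[i][j - 1] + 1,         # insertion
                dist[i - 1][j - 1] + cost,  # substitution, or free match
            )
    return dist[-1][-1]


print(levenshtein("adresse", "address"))  # prints 2
```

The result matches the Scheme session above: inserting a "d" and dropping the trailing "e" turns "adresse" into "address".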

You can also get the Levenshtein Distance of combinations of strings, lists, and vectors. For example:

> (levenshtein '#(#\a #\d #\r #\e #\s #\s #\e) "address")
2
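
Mixing sequence types works because the distance depends only on element-by-element equality, not on the container. The sketch below illustrates that idea in Python, keeping just two rows of the table to save memory. The name `sequence_levenshtein` is hypothetical and chosen for this example only; it does not mirror how levenshtein.scm is written.

```python
from typing import Sequence


def sequence_levenshtein(a: Sequence, b: Sequence) -> int:
    """Illustrative edit distance over any two indexable sequences.

    Hypothetical helper; elements are compared with ==.
    """
    # previous[j] is the distance from the elements of a processed so far
    # to the first j elements of b; it starts as the row for an empty a.
    previous = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        current = [i]  # turning i elements into an empty sequence takes i deletions
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution, or free match
            ))
        previous = current
    return previous[-1]


# A tuple of characters compared against a string, as in the vector example.
print(sequence_levenshtein(("a", "d", "r", "e", "s", "s", "e"), "address"))  # prints 2
```

The same function accepts lists of numbers or other comparable items, for example `sequence_levenshtein([1, 2, 3], [1, 3])`.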

See the levenshtein.scm documentation for additional procedures that can be used to improve performance in some applications.

Comments about this recipe

Contributors

-- NeilVanDyke - 16 May 2004

Discussion

This doesn't apply to Erlang, and is only here as a placeholder until the library is implemented. Coming to a Jungerl near you...

-- BrentAFulgham - 23 Aug 2004

CookbookForm
TopicType: Recipe
ParentTopic: StringRecipes
TopicOrder: 240

 
 
Copyright © 2004 by the contributing authors. All material on the Erlang Cookbook web site is the property of the contributing authors.
This material can be redistributed and/or modified under the terms of the GNU Lesser General Public License (LGPL), version 2.1, as published by the Free Software Foundation.