Soundex

Problem

You want to generate Soundex hashes of surnames, either to build "sounds-like" indexes for databases or to retrieve information from US Census records and similar pre-existing databases.

Soundex is a string hash historically used by the US Census for indexing surnames by a function of what they "sound" like, rather than their precise spelling. Further general information on Soundex is available at http://www.archives.gov/research_room/genealogy/census/soundex.html.

Solution

Use the soundex.scm library by Neil Van Dyke, which can be installed via a .plt file available from http://www.neilvandyke.org/soundex-scm/.

In PLT Scheme, require the module form of soundex.scm like this:

(require (lib "soundex.ss" "soundex"))

(Most non-PLT Scheme implementations can use the load procedure or other facility to load the file soundex.scm.)

Soundex keys are represented as four-character strings:

> (soundex "Smith")
"S530"
> (soundex "Smyth")
"S530"

Therefore, the equal? procedure can be used to compare Soundex keys:

> (equal? (soundex "Johnson") (soundex "Jackson"))
#f
> (equal? (soundex "Johnson") (soundex "JANZEN"))
#t
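
Because keys compare with equal?, they can also serve as lookup keys in an association list. The following is a minimal sketch of a "sounds-like" index, not part of soundex.scm: the names add-to-index and index-by are hypothetical helpers, and the hashing procedure is passed in as an argument so that the code does not depend on how the library was loaded.

```scheme
;; Hypothetical helpers (not part of soundex.scm).
;; An index is an association list whose entries look like
;; (key name1 name2 ...), kept in order of first appearance.

(define (add-to-index index key name)
  ;; Return a new index with NAME appended to the entry for KEY,
  ;; creating the entry at the end if KEY is not present yet.
  (cond ((null? index)
         (list (list key name)))
        ((equal? (car (car index)) key)
         (cons (append (car index) (list name)) (cdr index)))
        (else
         (cons (car index) (add-to-index (cdr index) key name)))))

(define (index-by key-proc names)
  ;; Group NAMES by the value that KEY-PROC returns for each name.
  (let loop ((names names) (index '()))
    (if (null? names)
        index
        (loop (cdr names)
              (add-to-index index (key-proc (car names)) (car names))))))
```

With soundex loaded as described above, a session might look like this:

> (index-by soundex '("Smith" "Johnson" "Smyth" "JANZEN"))
(("S530" "Smith" "Smyth") ("J525" "Johnson" "JANZEN"))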

Both current NARA Soundex and "old" Soundex are supported (soundex is an alias for soundex-nara):

> (soundex-nara "Ashcraft")
"A261"
> (soundex-old "Ashcraft")
"A226"

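The difference between the two results for "Ashcraft" is consistent with the commonly described Soundex rules: under the NARA rules, H and W do not separate two consonants that share a code, whereas older variants treat H and W like vowels. The sketch below illustrates those rules only; it is not the library's implementation. The names letter-code and sketch-soundex are hypothetical, and the code assumes a non-empty, purely alphabetic name.

```scheme
;; Illustrative sketch of the commonly described Soundex rules.
;; NOT the soundex.scm implementation; names here are hypothetical.

(define (letter-code ch)
  ;; Return the Soundex digit for a consonant, or #f for any other char.
  (case (char-downcase ch)
    ((#\b #\f #\p #\v) #\1)
    ((#\c #\g #\j #\k #\q #\s #\x #\z) #\2)
    ((#\d #\t) #\3)
    ((#\l) #\4)
    ((#\m #\n) #\5)
    ((#\r) #\6)
    (else #f)))

(define (sketch-soundex name hw-separates?)
  ;; HW-SEPARATES? = #f gives NARA-style behaviour (H and W are skipped);
  ;; HW-SEPARATES? = #t treats H and W like vowels (older behaviour).
  (let loop ((chars (cdr (string->list name)))
             (prev  (letter-code (string-ref name 0)))
             (codes '()))                ; digits so far, newest first
    (if (or (null? chars) (= (length codes) 3))
        ;; Pad with zeros to three digits and prepend the first letter.
        (let pad ((codes codes))
          (if (< (length codes) 3)
              (pad (cons #\0 codes))
              (list->string
               (cons (char-upcase (string-ref name 0))
                     (reverse codes)))))
        (let* ((ch   (car chars))
               (code (letter-code ch)))
          (cond ((and code (not (eqv? code prev)))
                 ;; New consonant code: record it.
                 (loop (cdr chars) code (cons code codes)))
                (code
                 ;; Same code as the previous consonant: collapse.
                 (loop (cdr chars) prev codes))
                ((and (memv (char-downcase ch) '(#\h #\w))
                      (not hw-separates?))
                 ;; NARA-style: H and W are skipped without resetting.
                 (loop (cdr chars) prev codes))
                (else
                 ;; Vowel (or H/W in the older style): acts as a separator.
                 (loop (cdr chars) #f codes)))))))
```

Under these assumptions, the sketch reproduces the two results shown above:

> (sketch-soundex "Ashcraft" #f)
"A261"
> (sketch-soundex "Ashcraft" #t)
"A226"
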
Multiple Soundex keys based on prefix-skipping can be generated with the soundex-nara/prefixing, soundex-old/prefixing, and soundex/p procedures:

> (soundex/p "vanderlinden")
("V536" "D645" "L535")
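
One way to use prefix-skipping keys is to treat two surnames as a possible match when their key lists share at least one key. The helper below is a hypothetical sketch, not part of soundex.scm; it operates on plain lists of key strings and assumes that the prefixing procedures always return such a list.

```scheme
;; Hypothetical helper (not part of soundex.scm).
(define (keys-overlap? keys-a keys-b)
  ;; Return #t if any key in KEYS-A also appears in KEYS-B.
  ;; member compares with equal?, so string keys compare by content.
  (cond ((null? keys-a) #f)
        ((member (car keys-a) keys-b) #t)
        (else (keys-overlap? (cdr keys-a) keys-b))))
```

For example, (keys-overlap? (soundex/p "vanderlinden") (soundex/p "Linden")) would be expected to return #t if the key list for "Linden" contains "L535".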

Discussion

Discussion here

See Also

 
 
Copyright © 2004 by the contributing authors. All material on the Schematics Cookbook web site is the property of the contributing authors.
The copyright for certain compilations of material taken from this website is held by the SchematicsEditorsGroup - see ContributorAgreement & LGPL.
Other than such compilations, this material can be redistributed and/or modified under the terms of the GNU Lesser General Public License (LGPL), version 2.1, as published by the Free Software Foundation.