string-split: splitting a string into substrings

Problem

Split a string into words separated by whitespace or other delimiters. See the function split in Python or Perl.

Examples

         (string-split " abc d e f  ")       ==> ("abc" "d" "e" "f")
         (string-split " abc d e f  " '() 1) ==> ("abc d e f  ")
         (string-split " abc d e f  " '() 0) ==> ()
         (string-split ":" '(#\:))           ==> ("" "")
         (string-split ":abc:d:e::f:" '(#\:))
               ==> ("" "abc" "d" "e" "" "f" "")
         (string-split "root:x:0:0:Lord" '(#\:) 2)
               ==> ("root" "x:0:0:Lord")
         (string-split "/usr/local/bin:/usr/bin:/usr/ucb/bin" '(#\:))
               ==> ("/usr/local/bin" "/usr/bin" "/usr/ucb/bin")
         (string-split "/usr/local/bin" '(#\/))
               ==> ("" "usr" "local" "bin")

Specification

         string-split STRING              -> STRINGS
         string-split STRING '()          -> STRINGS
         string-split STRING '() MAXSPLIT -> STRINGS
These procedures return a list of the whitespace-delimited words in STRING. Leading and trailing whitespace is trimmed from each word. If STRING is empty or contains only whitespace, the empty list is returned.

         string-split STRING CHARSET          -> STRINGS
         string-split STRING CHARSET MAXSPLIT -> STRINGS
These procedures return a list of words in STRING delimited by the characters in CHARSET. The latter is a list of characters to be treated as delimiters. Leading or trailing delimiters of the words are not trimmed. That is, the resulting list will have as many initial empty string elements as there are leading delimiters in STRING.

If MAXSPLIT is specified and positive, the resulting list will contain at most MAXSPLIT elements, the last of which is the string remaining after (MAXSPLIT - 1) splits. If MAXSPLIT is specified and non-positive, the empty list is returned. "In time critical applications it behooves you not to split into more fields than you really need."

Solution

In the original solution (see the URL below), inc is a macro or a function that returns its argument incremented by one. On many Scheme systems it can be implemented more efficiently than a plain (+ 1 x) if x is assumed to be a fixnum.

http://pobox.com/~oleg/ftp/Scheme/util.html#string-split
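As a rough illustration of the specification above, here is a minimal sketch of a procedure that follows it. It is not the original listing, which lives at the URL above. It assumes inc is a plain function, uses memv for the delimiter test, and relies only on standard procedures plus rest arguments.

```scheme
;; Assumption: inc as a plain function (the original may use a macro).
(define (inc x) (+ 1 x))

;; Sketch of string-split following the Specification section.
;; Optional arguments: CHARSET (a list of chars, '() means whitespace)
;; and MAXSPLIT (defaults to one more than the string length).
(define (string-split str . rest)
  (let* ((len (string-length str))
         (charset (if (pair? rest) (car rest) '()))
         (maxsplit (if (and (pair? rest) (pair? (cdr rest)))
                       (cadr rest)
                       (inc len))))

    ;; Is the character at index i a delimiter?
    (define (delimiter? i)
      (let ((c (string-ref str i)))
        (if (null? charset)
            (char-whitespace? c)
            (memv c charset))))

    ;; Index of the first non-delimiter at or after i (or len).
    (define (skip-ws i)
      (if (and (< i len) (delimiter? i)) (skip-ws (inc i)) i))

    ;; Collect fields starting at FROM, with N fields still allowed.
    (define (fields from n)
      (let ((from (if (null? charset) (skip-ws from) from)))
        (cond
          ;; whitespace mode: nothing but whitespace is left
          ((and (null? charset) (>= from len)) '())
          ;; last allowed field: return the remainder unsplit
          ((= n 1) (list (substring str from len)))
          (else
           (let scan ((i from))
             (cond
               ((>= i len) (list (substring str from len)))
               ((delimiter? i)
                (cons (substring str from i) (fields (inc i) (- n 1))))
               (else (scan (inc i)))))))))

    (if (or (= len 0) (not (positive? maxsplit)))
        '()
        (fields 0 maxsplit))))
```

With this definition, the calls in the Examples section produce the results shown there, for instance (string-split "root:x:0:0:Lord" '(#\:) 2) gives ("root" "x:0:0:Lord").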

Discussion


Comments about this recipe

Replace memq by memv if necessary. It's probably necessary if (eq? #\A #\A) ==> #f on your Scheme system. -- JohnRussell - 02 Mar 2005
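The reason behind this advice: R5RS leaves the result of eq? on characters up to the implementation, while eqv? compares characters by value, and memv is the membership test built on eqv?. A small sketch of a portable delimiter test (the helper name delimiter? is made up for illustration):

```scheme
;; Hypothetical helper: #t if character c is in the delimiter list.
;; memv compares with eqv?, which is reliable for characters.
(define (delimiter? c delimiters)
  (if (memv c delimiters) #t #f))

(delimiter? #\: (list #\: #\space))  ; => #t
(delimiter? #\a (list #\: #\space))  ; => #f
```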

If you only need to split on 1 character and you never need to split portions of the string, here's a simpler and potentially more efficient function:

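A minimal sketch of a single-character splitter in this spirit follows. It is not DougHoyte's original listing, and the name split-on-char is a placeholder. As written it skips empty fields, so leading, trailing, and repeated delimiters produce no empty strings.

```scheme
;; Hypothetical name. Splits STR on the single character CH,
;; skipping empty fields (runs of CH collapse).
(define (split-on-char str ch)
  (let ((len (string-length str)))
    (let loop ((start 0) (i 0))
      (cond
        ;; end of string: emit the last field unless it is empty
        ((>= i len)
         (if (= start i) '() (list (substring str start i))))
        ;; delimiter: emit the field before it unless it is empty
        ((char=? ch (string-ref str i))
         (if (= start i)
             (loop (+ i 1) (+ i 1))
             (cons (substring str start i) (loop (+ i 1) (+ i 1)))))
        (else (loop start (+ i 1)))))))

(split-on-char "/usr/local/bin" #\/)  ; => ("usr" "local" "bin")
```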

-- DougHoyte - 20 Apr 2006

The version above by DougHoyte doesn't entirely follow the original spec. Namely, it eats the delimiters:

This version doesn't eat them:

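A sketch of a single-character splitter that keeps empty fields follows, reading "eats the delimiters" as "drops the empty strings between adjacent delimiters". It is not the original listing, and the name split-on-char-keep-empty is a placeholder.

```scheme
;; Hypothetical name. Splits STR on the single character CH and keeps
;; empty fields, as the CHARSET form of the specification requires.
(define (split-on-char-keep-empty str ch)
  (let ((len (string-length str)))
    (let loop ((start 0) (i 0))
      (cond
        ;; end of string: the last field may be empty
        ((>= i len) (list (substring str start len)))
        ;; delimiter: emit the field before it, empty or not
        ((char=? ch (string-ref str i))
         (cons (substring str start i) (loop (+ i 1) (+ i 1))))
        (else (loop start (+ i 1)))))))

(split-on-char-keep-empty ":abc:d:e::f:" #\:)
;; => ("" "abc" "d" "e" "" "f" "")
```

One difference from the specification remains in this sketch: an empty input string yields ("") rather than the empty list.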


-- NoelWelsh - 19 May 2006

To understand the (delimiter-eating) version above by DougHoyte (20 Apr 2006), I added comments etc.:

-- GeorgeHerson - 18 Mar 2007

Perhaps it's worth mentioning REGEXP-SPLIT which MzScheme provides.
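A short sketch of how regexp-split behaves, assuming a MzScheme or Racket system where it is available (older MzScheme releases may need it loaded from a library). It is not part of R5RS, so this will not run on other Scheme systems.

```scheme
;; MzScheme / Racket only: regexp-split and the #rx reader syntax
;; are not standard Scheme.
(regexp-split #rx":" "root:x:0:0:Lord")
;; => ("root" "x" "0" "0" "Lord")

;; Matches at the start, at the end, or side by side yield empty strings,
;; so the delimiters are not collapsed.
(regexp-split #rx":" ":abc:d:e::f:")
;; => ("" "abc" "d" "e" "" "f" "")
```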

-- ChrisDone - 03 Sep 2007

There is also string-tokenize, which is provided by SRFI 13.

By default it splits on whitespace; SRFI 14 character sets provide a way to change the splitting characters.

Note that it, too, consumes the splitting characters.
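A brief sketch of both uses, assuming SRFI 13 and SRFI 14 are already loaded (how to load them differs between systems, for example (use-modules (srfi srfi-13) (srfi srfi-14)) in Guile or (require srfi/13 srfi/14) in Racket). Note that the optional argument names the characters that make up a token, not the delimiters.

```scheme
;; Assumes SRFI 13 (string-tokenize) and SRFI 14 (char-set procedures)
;; are available in the current environment.

;; The default token set is char-set:graphic, so whitespace separates tokens.
(string-tokenize " abc d e f  ")
;; => ("abc" "d" "e" "f")

;; To split on #\: pass the complement of the delimiter set as the token set.
(define not-colon (char-set-complement (char-set #\:)))
(string-tokenize ":abc:d:e::f:" not-colon)
;; => ("abc" "d" "e" "f")
```

Because tokens are maximal runs of token characters, empty fields never appear in the result.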

-- MarekKubica - 13 Sep 2008

Contributors

-- OlegK - 14 Sep 2004

CookbookForm
TopicType: Recipe
ParentTopic: StringRecipes
TopicOrder: 999

 
 
Copyright © 2004 by the contributing authors. All material on the Schematics Cookbook web site is the property of the contributing authors.
The copyright for certain compilations of material taken from this website is held by the SchematicsEditorsGroup - see ContributorAgreement & LGPL.
Other than such compilations, this material can be redistributed and/or modified under the terms of the GNU Lesser General Public License (LGPL), version 2.1, as published by the Free Software Foundation.