s c h e m a t i c s : c o o k b o o k

/ Cookbook.ListFlatten

This Web


WebHome 
WebChanges 
TOC (with recipes)
NewRecipe 
WebTopicList 
WebStatistics 

Other Webs


Chicken
Cookbook
Erlang
Know
Main
Plugins
Sandbox
Scm
TWiki  

Schematics


Schematics Home
Sourceforge Page
SchemeWiki.org
Original Cookbook
RSS

Scheme Links


Schemers.org
Scheme FAQ
R5RS
SRFIs
Scheme Cross Reference
PLT Scheme SISC
Scheme48 SCM
MIT Scheme scsh
JScheme Kawa
Chicken Guile
Bigloo Tiny
Gambit LispMe
GaucheChez

Lambda the Ultimate
TWiki.org

How to flatten a list.

Problem

To flatten a list means to destructure a list into a flat list. For example:

500 Can't connect to 127.0.0.1:8778 (connect: Connection refused)

The solution presented is a non-destructive function that operates very quickly. For instance, it out-performs this popular implementation of flatten:

500 Can't connect to 127.0.0.1:8778 (connect: Connection refused)

Notice that this version of flatten utilizes the destructive function `reverse!'. However, it is being used on the newly created list, and not the input list. Thus, this function is not classified as destructive, since its argument remains unchanged.

Solution

The fastest way to flatten a list non-destructively, that the author is aware of, happens to be the algorithm shown above, without reversing the final result. Thus, if the order of elements of the list to be flattened is of no significance to your particular requirement to flatten a list, this author recommends that you use the function as implemented above, with the exception that the application of `reverse!' is removed.

However, if the order of elements is important to you, then you will find the following function to be very fast, and robust in its ability to handle various structures of lists and size, without degrading its performance by more than a small factor. The algorithm implemented below operates in linear time with respect to the number of elements in the list, in other words it has complexity O(n).

500 Can't connect to 127.0.0.1:8778 (connect: Connection refused)

Two new functions to flatten a list were provided by readers of the authors blog. They showed exceptional performance, in many cases beating the function described above. In the interest of making this recipe more thorough in its coverage of fast algorithms available to flatten a list, I present these two functions. I am not the author of either of these functions, but they were placed into the public domain by their publication on the authors blog. I will refrain from comparing their relative performances on this page, as that has been discussed at great length in their initial publication. The interested reader need only to a quick web search on Scheme, flatten, benchmark, algorithms to find the benchmark results and analysis. It is not appropriate to advertise my blog on this page by giving an explicit URL.

The first function was contributed by Eugene Wallingford, and is an elegant solution that employs two mutually recursive helper functions to structure its result in the order of visitation of its input argument. Trivial liberty with the authors code has been taken in order to present it as a single function, with the helper functions being internally bound by letrec, and minor changes to argument naming, to reflect its ability to operate on heterogeneous lists, since the function, as first presented, was designed specifically for lists of symbol?'s.

500 Can't connect to 127.0.0.1:8778 (connect: Connection refused)

The second function was contributed as a derivation of this authors function, which appears above. It is named flatten-Sacundim, after the only name this functions author provided. Sacundim uses an ingenious technique, specifically to eliminate all non-tail recursive calls found in the function originally presented in this recipe (compare with the code above.)

500 Can't connect to 127.0.0.1:8778 (connect: Connection refused)

Discussion

The following is the original discussion, prior to the contribution of the two new functions presented today, 2006-06-26. As mentioned above, there will be no additional discussion added by this author, as it would be redundant to the analysis given when they were first published on the authors blog. If someone else would like to contribute to the discussion, it would be warmly received.

You'll notice this function also employs a destructive function, this time, the primitive `set-cdr!'. However, like the implementation above, the destructive primitive is used in the building of the newly created list, and not on its argument, so this function is non-destructive. You may also notice that it employs an odd initial value for the list being created; e.g., the pair? (cons '() '()). This is done to avoid an expensive special case for the very first element added to the list being built, that otherwise would need to be included within the internal function f, where all the recursion takes place.

The overview of this function's algorithm is to avoid the need to reverse the result by maintaining a lexically bound variable, that always remains bound to the value of the last pair? seen, and in producing exactly one new cons pair for every non-pair in its argument, plus the single initial cons pair described above. The author has benchmarked this implementation of flatten with a variety of list structures, including a deeply nested list, a list having the shape of a binary search tree and a completely flat list. The flat list is of particular interest, because the author found other implementations to be fragile when they were handed the answer as their argument.

The benchmarks were performed using DrScheme in no debug mode, as well as being repeated with the Chicken Scheme to C compiler. The ranking of the various algorithms changed somewhat between the slower implementations, but did not affect that of the three fastest algorithms. Lists lengths of 2,000,000 were possible with DrScheme. Note, the length merely represents the size of the cdr spine of a list. The actual lists contained substantially more elements than reported by the `length' function, due to nesting within the lists. For technical reasons, I was not able to test beyond lists of length 1,000,000 using the Chicken compiler. This, no doubt, is the result of my inexperience with the compiler. NOTE: 2006-06-26, this limitation was confirmed, by Felix of Chicken, to be a legitimate limitation. However, as the author has now concluded extensive tests with the Chicken compiler, be assured that this was clearly a conscious design choice made by the authors of Chicken, the result of which is that the Chicken compiler produces code which operates quite linearly on those lists for which it can accommodate.

The author would be very appreciative of the results of other benchmarks, especially on platforms other than DrScheme. Please add your comments and results if you are interested in comparing these functions and results on any platform, and especially if you know of an implementation of flatten which you believe to be faster.


Comments about this recipe

Contributors

The author is grateful to Robby Findler, who provided a critique of my code, which led to writing the implementation of flatten presented in this recipe. Additionally, now that their are two (2) more examples of excellent functions provided by readers of the authors blog, the author wishes to thank Eugene Wallingford and Sacundim for contributing these very fast functions, which take some of the luster off the original function presented in this recipe, under conditions favorable to the newly presented functions.

-- KyleSmith - 26 Jun 2007

CookbookForm
TopicType: Recipe
ParentTopic: ListRecipes
TopicOrder: 999

 
 
Copyright © 2004 by the contributing authors. All material on the Schematics Cookbook web site is the property of the contributing authors.
The copyright for certain compilations of material taken from this website is held by the SchematicsEditorsGroup - see ContributorAgreement & LGPL.
Other than such compilations, this material can be redistributed and/or modified under the terms of the GNU Lesser General Public License (LGPL), version 2.1, as published by the Free Software Foundation.
Ideas, requests, problems regarding Schematics Cookbook? Send feedback.
/ You are Main.guest