/ Cookbook.ListFlatten

500 Can't connect to 127.0.0.1:8778 (connect: Connection refused)

The solution presented is a non-destructive function that operates very quickly. For instance, it out-performs this popular implementation of flatten:

500 Can't connect to 127.0.0.1:8778 (connect: Connection refused)

Notice that this version of flatten utilizes the destructive function `reverse!'. However, it is being used on the newly created list, and not the input list. Thus, this function is not classified as destructive, since its argument remains unchanged.

The fastest way to flatten a list non-destructively, that the author is aware of, happens to be the algorithm shown above, without reversing the final result. Thus, if the order of elements of the list to be flattened is of no significance to your particular requirement to flatten a list, this author recommends that you use the function as implemented above.

However, if the order of elements is important to you, then you will find the following function to be very fast, and robust in its ability to handle various structures of lists and size, without degrading its performance by more than a small factor. The algorithm implemented below operates in linear time with respect to the number of pair?'s in the list, in other words it has complexity O(n).

500 Can't connect to 127.0.0.1:8778 (connect: Connection refused)

You'll notice this function also employs a destructive function, this time, the primitive `set-cdr!'. However, like the implementation above, the destructive primitive is used in the building of the newly created list, and not on its argument, so this function is non-destructive. You may also notice that it employs an odd initial value for the list being created; e.g., the pair? (cons '() '()). This is done to avoid an expensive special case for the very first element added to the list being built, that otherwise would be need to be included within the internal function f, where all the recursion takes place.

The overview of this function's algorithm is to avoid the need to reverse the result by maintaining a lexically bound variable, that always remains bound to the value of the last pair? seen, and in producing exactly one new cons pair for every pair in its argument, plus the single initial cons pair described above. The author has benchmarked this implementation of flatten with a variety of list structures, including a deeply nested list, a list having the shape of a binary search tree and a completely flat list. The flat list is of particular interest, because the author found other implementations to be fragile when they were handed the answer as their argument.

The benchmarks were performed using DrScheme in no debug mode, as well as being repeated with the Chicken Scheme to C compiler. The ranking of the various algorithms changed somewhat between the slower implementations, but did not affect that of the three fastest algorithms. Lists lengths of 2,000,000 were possible with DrScheme. Note, the length merely represents the size of the cdr spine of a list. The actual lists contained substantially more elements than reported by the `length' function, due to nesting within the lists. For technical reasons, I was not able to test beyond lists of length 1,000,000 using the Chicken compiler. This, no doubt, is the result of my inexperience with the compiler.

The author would be very appreciative of the results of other benchmarks, especially on platforms other than DrScheme. Please add your comments and results if you are interested in comparing these functions and results on any platform, and especially if you know of an implementation of flatten which you believe to be faster.

The author is grateful to Robby Findler, who provided a critique of my code, which led to writing the implementation of flatten presented in this recipe.

-- KyleSmith - 03 Jun 2007

CookbookForm | |
---|---|

TopicType: | Recipe |

ParentTopic: | ListRecipes |

TopicOrder: | 999 |

The copyright for certain compilations of material taken from this website is held by the SchematicsEditorsGroup - see ContributorAgreement & LGPL.

Other than such compilations, this material can be redistributed and/or modified under the terms of the GNU Lesser General Public License (LGPL), version 2.1, as published by the Free Software Foundation.

Ideas, requests, problems regarding Schematics Cookbook? Send feedback. / You are Main.guest