annotate src/clojure/contrib/combinatorics.clj @ 10:ef7dbbd6452c

added clojure source goodness
author Robert McIntyre <rlm@mit.edu>
date Sat, 21 Aug 2010 06:25:44 -0400
rev   line source
rlm@10 1 ;;; combinatorics.clj: efficient, functional algorithms for generating lazy
rlm@10 2 ;;; sequences for common combinatorial functions.
rlm@10 3
rlm@10 4 ;; by Mark Engelberg (mark.engelberg@gmail.com)
rlm@10 5 ;; January 27, 2009
rlm@10 6
rlm@10 7 (comment
rlm@10 8 "
rlm@10 9 (combinations items n) - A lazy sequence of all the unique
rlm@10 10 ways of taking n different elements from items.
rlm@10 11 Example: (combinations [1 2 3] 2) -> ((1 2) (1 3) (2 3))
rlm@10 12
rlm@10 13 (subsets items) - A lazy sequence of all the subsets of
rlm@10 14 items (but generalized to all sequences, not just sets).
rlm@10 15 Example: (subsets [1 2 3]) -> (() (1) (2) (3) (1 2) (1 3) (2 3) (1 2 3))
rlm@10 16
rlm@10 17 (cartesian-product & seqs) - Takes any number of sequences
rlm@10 18 as arguments, and returns a lazy sequence of all the ways
rlm@10 19 to take one item from each seq.
rlm@10 20 Example: (cartesian-product [1 2] [3 4]) -> ((1 3) (1 4) (2 3) (2 4))
rlm@10 21 (cartesian-product seq1 seq2 seq3 ...) behaves like but is
rlm@10 22 faster than a nested for loop, such as:
rlm@10 23 (for [i1 seq1 i2 seq2 i3 seq3 ...] (list i1 i2 i3 ...))
rlm@10 24
rlm@10 25 (selections items n) - A lazy sequence of all the ways to
rlm@10 26 take n (possibly the same) items from the sequence of items.
rlm@10 27 Example: (selections [1 2] 3) -> ((1 1 1) (1 1 2) (1 2 1) (1 2 2) (2 1 1) (2 1 2) (2 2 1) (2 2 2))
rlm@10 28
rlm@10 29 (permutations items) - A lazy sequence of all the permutations
rlm@10 30 of items.
rlm@10 31 Example: (permutations [1 2 3]) -> ((1 2 3) (1 3 2) (2 1 3) (2 3 1) (3 1 2) (3 2 1))
rlm@10 32
rlm@10 33 (lex-permutations items) - A lazy sequence of all distinct
rlm@10 34 permutations in lexicographic order
rlm@10 35 (this function returns the permutations as
rlm@10 36 vectors). Only works on sequences of numbers, since
rlm@10 37 items are compared with <. (Note that the result will be quite different from
rlm@10 38 permutations when the sequence contains duplicate items.)
rlm@10 39 Example: (lex-permutations [1 1 2]) -> ([1 1 2] [1 2 1] [2 1 1])
rlm@10 40
rlm@10 41 About permutations vs. lex-permutations:
rlm@10 42 lex-permutations is faster than permutations, but only works
rlm@10 43 on sequences of numbers. They operate differently
rlm@10 44 on sequences with duplicate items (lex-permutations will only
rlm@10 45 give you back distinct permutations). lex-permutations always
rlm@10 46 returns the permutations sorted lexicographically whereas
rlm@10 47 permutations will be in an order where the input sequence
rlm@10 48 comes first. In general, I recommend using the regular
rlm@10 49 permutations function unless you have a specific
rlm@10 50 need for lex-permutations.
rlm@10 51
rlm@10 52 About this code:
rlm@10 53 These combinatorial functions can be written in an elegant way using recursion. However, when dealing with combinations and permutations, you're usually generating large numbers of things, and speed counts. My objective was to write the fastest possible code I could, restricting myself to Clojure's functional, persistent data structures (rather than using Java's arrays) so that this code could be safely leveraged within Clojure's transactional concurrency system.
rlm@10 54
rlm@10 55 I also restricted myself to algorithms that return results in a standard order. For example, there are faster ways to generate cartesian-product, but I don't know of a faster way to generate the results in the standard nested-for-loop order.
rlm@10 56
rlm@10 57 Most of these algorithms are derived from algorithms found in Knuth's wonderful Art of Computer Programming books (specifically, the volume 4 fascicles), which present fast, iterative solutions to these common combinatorial problems. Unfortunately, these iterative versions are somewhat inscrutable. If you want to better understand these algorithms, the Knuth books are the place to start.
rlm@10 58
rlm@10 59 On my own computer, I use versions of all these algorithms that return sequences built with an uncached variation of lazy-seq. Not only does this boost performance, but it's easier to use these rather large sequences more safely (from a memory consumption standpoint). If some form of uncached sequences makes it into Clojure, I will update this accordingly.
rlm@10 60 "
rlm@10 61 )
rlm@10 62
rlm@10 63
rlm@10 64 (ns
rlm@10 65 ^{:author "Mark Engelberg",
rlm@10 66 :doc "Efficient, functional algorithms for generating lazy
rlm@10 67 sequences for common combinatorial functions. (See the source code
rlm@10 68 for a longer description.)"}
rlm@10 69 clojure.contrib.combinatorics)
rlm@10 70
rlm@10 71 (defn- index-combinations
rlm@10 72 [n cnt]
rlm@10 73 (lazy-seq
rlm@10 74 (let [c (vec (cons nil (for [j (range 1 (inc n))] (+ j cnt (- (inc n)))))),
rlm@10 75 iter-comb
rlm@10 76 (fn iter-comb [c j]
rlm@10 77 (if (> j n) nil
rlm@10 78 (let [c (assoc c j (dec (c j)))]
rlm@10 79 (if (< (c j) j) [c (inc j)]
rlm@10 80 (loop [c c, j j]
rlm@10 81 (if (= j 1) [c j]
rlm@10 82 (recur (assoc c (dec j) (dec (c j))) (dec j)))))))),
rlm@10 83 step
rlm@10 84 (fn step [c j]
rlm@10 85 (cons (rseq (subvec c 1 (inc n)))
rlm@10 86 (lazy-seq (let [next-step (iter-comb c j)]
rlm@10 87 (when next-step (step (next-step 0) (next-step 1)))))))]
rlm@10 88 (step c 1))))
rlm@10 89
rlm@10 90 (defn combinations
rlm@10 91 "All the unique ways of taking n different elements from items"
rlm@10 92 [items n]
rlm@10 93 (let [v-items (vec (reverse items))]
rlm@10 94 (if (zero? n) (list ())
rlm@10 95 (let [cnt (count items)]
rlm@10 96 (cond (> n cnt) nil
rlm@10 97 (= n cnt) (list (seq items))
rlm@10 98 :else
rlm@10 99 (map #(map v-items %) (index-combinations n cnt)))))))
rlm@10 100
rlm@10 101 (defn subsets
rlm@10 102 "All the subsets of items"
rlm@10 103 [items]
rlm@10 104 (mapcat (fn [n] (combinations items n))
rlm@10 105 (range (inc (count items)))))
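
;; A hedged usage sketch (not part of the original source): because subsets
;; concatenates (combinations items n) for n = 0 up to (count items), the
;; results come out grouped by size, smallest first.
(comment
  (map count (subsets [1 2 3]))
  ;; sizes follow the result documented in the header comment,
  ;; (() (1) (2) (3) (1 2) (1 3) (2 3) (1 2 3)), i.e. (0 1 1 1 2 2 2 3)
  )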
rlm@10 106
rlm@10 107 (defn cartesian-product
rlm@10 108 "All the ways to take one item from each sequence"
rlm@10 109 [& seqs]
rlm@10 110 (let [v-original-seqs (vec seqs)
rlm@10 111 step
rlm@10 112 (fn step [v-seqs]
rlm@10 113 (let [increment
rlm@10 114 (fn [v-seqs]
rlm@10 115 (loop [i (dec (count v-seqs)), v-seqs v-seqs]
rlm@10 116 (if (= i -1) nil
rlm@10 117 (if-let [rst (next (v-seqs i))]
rlm@10 118 (assoc v-seqs i rst)
rlm@10 119 (recur (dec i) (assoc v-seqs i (v-original-seqs i)))))))]
rlm@10 120 (when v-seqs
rlm@10 121 (cons (map first v-seqs)
rlm@10 122 (lazy-seq (step (increment v-seqs)))))))]
rlm@10 123 (when (every? seq seqs)
rlm@10 124 (lazy-seq (step v-original-seqs)))))
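
;; A hedged usage sketch (not part of the original source): per the header
;; comment, cartesian-product should yield the same items, in the same order,
;; as the equivalent nested for comprehension.
(comment
  (= (cartesian-product [1 2] [3 4])
     (for [i1 [1 2] i2 [3 4]] (list i1 i2)))
  ;; both sides are documented in the header as ((1 3) (1 4) (2 3) (2 4))
  )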
rlm@10 125
rlm@10 126
rlm@10 127 (defn selections
rlm@10 128 "All the ways of taking n (possibly the same) elements from the sequence of items"
rlm@10 129 [items n]
rlm@10 130 (apply cartesian-product (take n (repeat items))))
rlm@10 131
rlm@10 132
rlm@10 133 (defn- iter-perm [v]
rlm@10 134 (let [len (count v),
rlm@10 135 j (loop [i (- len 2)]
rlm@10 136 (cond (= i -1) nil
rlm@10 137 (< (v i) (v (inc i))) i
rlm@10 138 :else (recur (dec i))))]
rlm@10 139 (when j
rlm@10 140 (let [vj (v j),
rlm@10 141 l (loop [i (dec len)]
rlm@10 142 (if (< vj (v i)) i (recur (dec i))))]
rlm@10 143 (loop [v (assoc v j (v l) l vj), k (inc j), l (dec len)]
rlm@10 144 (if (< k l)
rlm@10 145 (recur (assoc v k (v l) l (v k)) (inc k) (dec l))
rlm@10 146 v))))))
rlm@10 147
rlm@10 148 (defn- vec-lex-permutations [v]
rlm@10 149 (when v (cons v (lazy-seq (vec-lex-permutations (iter-perm v))))))
rlm@10 150
rlm@10 151 (defn lex-permutations
rlm@10 152 "Fast lexicographic permutation generator for a sequence of numbers"
rlm@10 153 [c]
rlm@10 154 (lazy-seq
rlm@10 155 (let [vec-sorted (vec (sort c))]
rlm@10 156 (if (zero? (count vec-sorted))
rlm@10 157 (list [])
rlm@10 158 (vec-lex-permutations vec-sorted)))))
rlm@10 159
rlm@10 160 (defn permutations
rlm@10 161 "All the permutations of items, lexicographic by index"
rlm@10 162 [items]
rlm@10 163 (let [v (vec items)]
rlm@10 164 (map #(map v %) (lex-permutations (range (count v))))))
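
;; A hedged usage sketch (not part of the original source) contrasting the two
;; permutation functions on input with duplicates. permutations permutes
;; indices, so equal items are treated as distinct positions; lex-permutations
;; permutes the sorted values and returns only distinct orderings.
(comment
  (count (permutations [1 1 2]))      ; 6, one result per index permutation
  (count (lex-permutations [1 1 2]))  ; 3, per the documented ([1 1 2] [1 2 1] [2 1 1])
  (first (permutations [3 1 2]))      ; the input order, (3 1 2), comes first
  )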