```ruby
class Array
  def shuffle!
    (1...size).each do |j|
      r = rand(j + 1)
      self[r], self[j] = at(j), at(r)
    end
    self
  end

  def shuffle
    dup.shuffle!
  end
end
```

A few notes on the implementation:

- `at` is supposed to be a bit faster than `[]`, but it works only on the right-hand side: there is no `at=` writer, so stores still go through `[]=`.
- Minimize the arithmetic inside the loop (the only per-iteration work is `rand(j + 1)` and the swap).
- Minimize the arithmetic outside the loop as well.
- Let the user decide whether to pay for the `dup` or not: `shuffle!` mutates in place, `shuffle` copies first.
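Whether `at` actually wins depends on your interpreter and version; here is a micro-benchmark sketch (my own, not from the post) to check it on your build:

```ruby
require 'benchmark'

a = (1..500_000).to_a

Benchmark.bm(4) do |x|
  x.report('[]') { a.each_index { |i| a[i] } }
  x.report('at') { a.each_index { |i| a.at(i) } }
end

# Reading is the only option: there is no at= writer, so stores use []=.
p a.at(7) == a[7]  # => true -- both read the same element
```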

After a bunch of thinking, it can be proved that *no* sort algorithm will make shuffle-by-sort work (meaning that this isn't just a detail of quicksort that mucks things up; it won't work for mergesort or anything else):

If the sort algorithm makes k comparisons (i.e. k calls to rand), then there are 2^k equally likely executions, which can't be mapped *uniformly* onto the n! possible shuffled outputs: for n >= 3, n! has an odd prime factor, so it can never divide 2^k.

(As I write that, I realize it's the same argument, made about a non-sorting algorithm, in the post you linked to.)
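The counting argument can be checked exactly for n = 3. The sketch below (my own illustration, not from the post) runs an insertion sort whose comparator is a scripted coin flip, enumerates all 2^3 equally likely flip sequences, and tallies the outputs: the six permutations get weights 2, 2, 1, 1, 1, 1 out of 8, since 8 outcomes can never split evenly six ways.

```ruby
# Insertion sort where every comparison is answered by a pre-scripted coin flip.
def coin_sort(arr, bits)
  a = arr.dup
  used = 0
  (1...a.size).each do |k|
    j = k
    while j > 0
      flip = bits[used]
      used += 1
      break if flip == 0               # comparator said "already in order"
      a[j], a[j - 1] = a[j - 1], a[j]  # comparator said "swap them"
      j -= 1
    end
  end
  a
end

counts = Hash.new(0)
8.times do |n|                         # all 2^3 equally likely flip sequences
  bits = [n[2], n[1], n[0]]            # Integer#[] extracts one bit
  counts[coin_sort([1, 2, 3], bits)] += 1
end

p counts.values.sort                   # => [1, 1, 1, 1, 2, 2] -- not uniform
```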

Also see http://www.codinghorror.com/blog/archives/001015.html if you want more proof.

Does `arr.sort{ rand(3) - 1 }` really give a uniform distribution? It seems to depend on the details of the sort.

(For example, if we were to use bubblesort, the probability that the first element ends up last is only 1/2^(n-1), since it must win all n-1 of its first-pass comparisons, instead of 1/n. So for quicksort, I need some convincing that the results really are uniform.)

(Also, at the very least, that cute-but-inefficient version would want to use `arr.sort{ 2 * rand(2) - 1 }`, so the comparator never returns 0.)
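For reference, the difference between the two comparators is just their ranges: `rand(3) - 1` can return 0, which tells `sort` the two elements compare equal, while the suggested form never does. A quick check:

```ruby
# Sample each comparator expression and collect the distinct values it produces.
vals3 = Array.new(1_000) { rand(3) - 1 }.uniq.sort      # => [-1, 0, 1]
vals2 = Array.new(1_000) { 2 * rand(2) - 1 }.uniq.sort  # => [-1, 1]
p [vals3, vals2]
```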

delicious’ed.

Consider two functions: F(n) = A * n and G(n) = B * n * n, where A and B can be arbitrarily large, but constant. Do you need to know the values of A and B to tell that, given n big enough, G(n) > F(n)?

Or, more formally: we can easily prove that there is some N such that for every n > N, G(n) > F(n) holds. That means it holds for "almost all" numbers, and this is enough for algorithmic theory to say "G is slower (bigger) than F", because the theory compares asymptotic complexities, that is, behavior as n increases to infinity.
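A concrete sketch with made-up constants (A = 1000 deliberately huge next to B = 1): the linear F wins for small n, but past N = A/B = 1000 the quadratic G is bigger and stays bigger:

```ruby
a_const = 1000             # F(n) = 1000 * n  (large constant, linear)
b_const = 1                # G(n) = 1 * n * n (small constant, quadratic)
f = ->(n) { a_const * n }
g = ->(n) { b_const * n * n }

p f.(500)   > g.(500)      # => true: below N = A/B, the linear F is bigger
p g.(5_000) > f.(5_000)    # => true: above N, the quadratic G dominates
```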
