Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

Python uses "closed open" intervals with `range(0, n)`, the reverse is then `range(n - 1, -1, -1)`, which is then highly unintuitive. This in connection with 0-based array indexing makes certain algorithms then very cumbersome. For example Knuth-Shuffle. In Python this is:

    from random import randrange
    x = [10, 20, 30, 40, 50 ]
    for i in range(len(x) - 1, 0, -1):
        r = randrange(i + 1)
        x[i], x[r] = x[r], x[i]
    print(x)

With 1-based indexing and inclusive ranges it would be much more understandable:

    a[] = [ 10 20 30 40 50 ]
    for i = len a[] downto 2
        r = random i
        swap a[i] a[r]
    end
    print a[]


In Python, reversed(range(0, n)) (which is also a thing you can write—does that help?) is indeed written range(n-1, -1, -1), but I think I recently encountered a language where that was written as (the local syntax’s equivalent of) range(n, 0, -1), that is to say the rule was not “start inclusive, end exclusive” but “low inclusive, high exclusive”. I’m not sure what language it was and whether this option ultimately ends up more intuitive, but in any case it’s a valid option.

(One difference is that the start-end rule can work polymorphically with equality only, while the low-high rule requires an ordering.)


With Knuth-Shuffle, it would be: for i in reversed(range(1, len(x))): That makes it better, but not much.


A potential issue with reversed(range(...)) is reversed() eagerly constructing the entire range.


Nope: unlike, say, sorted(), reversed() will never force an iterator into a sequence—it will either call __reversed__ or fall back to __len__ and __getitem__. An iterator needs to know how to reverse itself, or it’s out of luck. (Which is quite annoying because it forces you to write a class rather than a generator function.) Ranges are in the former category, they know how to reverse themselves[1]. (Yet for some reason range(0, 42) and range(41, -1, -1) repr themselves as range(0, 42) and range(41, -1, -1) respectively while reversed(range(0, 42)) is instead <range_iterator object at ...>. Truly, I cannot touch anything without finding a bug or at least a suspiciously insectoid lifeform.)

[1] https://github.com/python/cpython/blob/7e3f09cad9b783d8968aa...


Thanks, TIL about the __reversed__ protocol despite having lived in chapter three of the Python reference for some time.


There's one pretty common exception that's "closed closed".

    random.randint(a, b)
    Return a random integer N such that a <= N <= b. Alias for randrange(a, b+1).
https://docs.python.org/3/library/random.html#random.randint

But in reality nowadays you almost always want to use the newer, simpler and more secure "secrets" module.


TIL about the "secrets" module. I have typically used os.urandom() when I needed secure random.


I know Knuth wrote it that way, but there's no good reason why the iteration should be downwards. That is, you could write:

  for i in range(1, len(x))
and you would still get an unbiased shuffle algorithm.

The upwards iteration has a few advantages:

• You can start shuffling before you know how big the input is.

• Algorithm R for reservior sampling can be seen a specialized version of shuffling, in which you skip the work that wouldn't affect the first k items, or would only affect their order.


If you want to do it the other way around, you have to start at 0:

    for i in range(len(x) - 1):
        r = randrange(i, len(x))
        x[i], x[r] = x[r], x[i]


That probably works too, but it's even more awkard than the Knuth's one, and doesn't have the nice properties I mentioned.

I meant just:

    for i in range(1, len(x)):
        r = randrange(i + 1)
        x[i], x[r] = x[r], x[i]


Cool. Seems to work, but I don't think it's still Knuth Shuffle.


You can think of it as Knuth's unshuffle. :)




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: