
Use random-shuffle list shuffling instead of random >= 1.3#169

Merged
mchav merged 6 commits intoDataHaskell:mainfrom
jisantuc:maint/js/downgrade-random
Feb 27, 2026

Conversation

@jisantuc
Contributor

Overview

This PR downgrades the bound on random from >= 1.3 to >= 1.2 && < 1.3, since random 1.3+ plays poorly with some packages in the ecosystem.
Instead of uniformShuffleList, it uses shuffle' from random-shuffle, which was already a test dependency.
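
For reference, the bound change expressed in .cabal terms would look roughly like this (illustrative only, not the actual dataframe.cabal contents):

```cabal
build-depends:
    random >= 1.2 && < 1.3
  , random-shuffle
```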

@jisantuc
Contributor Author

Per the contributing guidelines, I tried to add a label (I assumed that was what "A tag (usually feat, documentation, refactor etc)" meant?) but it seems like I'm not allowed to do that 🤔


 shuffledIndices :: (RandomGen g) => g -> Int -> VU.Vector Int
-shuffledIndices pureGen k = VU.fromList (fst (uniformShuffleList [0 .. (k - 1)] pureGen))
+shuffledIndices pureGen k = VU.fromList (shuffle' [0 .. (k - 1)] k pureGen)
Contributor Author


Pretty unclear to me how to test this. I thought about a shuffle/un-shuffle identity test, but unshuffle didn't get me very far in search results 😅

Any thoughts?

Contributor


There are a couple of things I would test then.

  1. Test that shuffling does nothing other than shuffle: sort both the shuffled and the unshuffled lists and check they are equal. This ensures shuffling only permutes the indices.

  2. Check that shuffling with equal seeds results in the same shuffle.

That's about all I can think of.

Member


Yeah, no need to round trip. Checking that shuffling preserves length (even when there are duplicates) is probably important, plus that different seeds produce different shuffle orders.
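
The properties suggested above can be written as plain predicates over any seeded shuffle. A minimal sketch follows; `rotateShuffle` is a hypothetical stand-in so the snippet runs without the random-shuffle package, and a real test would pass shuffle' instead:

```haskell
import Data.List (sort)

-- A shuffle must be a permutation of its input: sorting both sides
-- should give the same list (this also covers duplicates).
isPermutationOf :: Ord a => [a] -> [a] -> Bool
isPermutationOf a b = sort a == sort b

-- Shuffling must not add or drop elements.
preservesLength :: [a] -> [b] -> Bool
preservesLength a b = length a == length b

-- Equal seeds must produce equal shuffles for any shuffle function.
sameSeedSameShuffle :: Eq a => (Int -> [a] -> [a]) -> Int -> [a] -> Bool
sameSeedSameShuffle shuf seed xs = shuf seed xs == shuf seed xs

-- Hypothetical stand-in shuffle: rotate the list by the seed.
rotateShuffle :: Int -> [a] -> [a]
rotateShuffle _ [] = []
rotateShuffle seed xs = drop k xs ++ take k xs
  where k = seed `mod` length xs
```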

Member


Also, on second thought, the intermediate list allocation is wasteful. I'll add implementing Fisher-Yates here as a GSoC task.
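
For context, an in-place Fisher-Yates shuffle avoids the intermediate list by swapping within a mutable array. A self-contained sketch follows; the xorshift-style `step` function is a hypothetical stand-in for a real PRNG (not the generator from the random package), and the array type comes from the array boot library rather than vector:

```haskell
import Control.Monad.ST (ST, runST)
import Data.Array.ST (STUArray, getElems, newListArray, readArray, writeArray)
import Data.Bits (shiftR, xor)
import Data.List (sort)
import Data.Word (Word64)

-- Hypothetical PRNG step (xorshift-flavoured); a stand-in only.
step :: Word64 -> Word64
step s =
  let a = s `xor` (s `shiftR` 13)
  in a `xor` (a * 2685821657736338717) + 2654435769

-- In-place Fisher-Yates: walk i from the end, swap element i with a
-- uniformly chosen element j in [0, i].
fisherYates :: Word64 -> [Int] -> [Int]
fisherYates seed xs = runST $ do
  let n = length xs
  arr <- newListArray (0, n - 1) xs
  let swap :: STUArray s Int Int -> Int -> Int -> ST s ()
      swap a i j = do
        vi <- readArray a i
        vj <- readArray a j
        writeArray a i vj
        writeArray a j vi
  let go g i
        | i <= 0 = pure ()
        | otherwise = do
            let g' = step g
                j = fromIntegral (g' `mod` fromIntegral (i + 1)) :: Int
            swap arr i j
            go g' (i - 1)
  go seed (n - 1)
  getElems arr

-- Permutation check: a shuffle must only reorder its input.
isPermutation :: [Int] -> [Int] -> Bool
isPermutation a b = sort a == sort b
```

A production version would instead mutate an unboxed vector and take the generator from the random package, but the swap loop is the same.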

Contributor


That task shouldn't be GSoC-only; it should be open to anyone! Also, mwc-random does that, I think.

Member


Why not? It seems simple and self-contained enough, since it's just reading the algorithm and implementing it.

Contributor


There could be some task pipeline for people who aren't interested only in GSoC, but more generally in contributing!

@jisantuc
Contributor Author

I used this branch's package in the plot survey branch I started, with no problems other than needing to roll my own list generator with replicateM and state: jisantuc/goofing-off@7c9b310#diff-206b9ce276ab5971a2489d75eb1b12999d4bf3843b7988cbe8d687cfde61dea0R4
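
The replicateM-plus-state pattern mentioned above can be sketched like so; `nextInt` is a hypothetical stand-in for a pure generator step such as random's uniformR, here a toy LCG producing values in [0, 100):

```haskell
import Control.Monad (replicateM)
import Control.Monad.Trans.State (State, evalState, state)

-- Hypothetical generator step: returns a value and the next seed.
nextInt :: Int -> (Int, Int)
nextInt s =
  let s' = 6364136223846793005 * s + 1442695040888963407
  in (s' `mod` 100, s')

-- Generate n values by threading the seed through the State monad:
-- `state nextInt` lifts the step, replicateM runs it n times.
randomList :: Int -> Int -> [Int]
randomList n seed = evalState (replicateM n (state nextInt)) seed
```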

@mchav
Member

mchav commented Feb 27, 2026

@jisantuc I think there's an issue with the test module name. Once that's fixed, this will be good to go.

@jisantuc
Contributor Author

@mchav yeah, I forgot to rename the module after I understood the test naming convention a little better. Fixed now, though.

@mchav mchav merged commit 03e34ab into DataHaskell:main Feb 27, 2026
7 checks passed
@jisantuc
Contributor Author

Per the contributing guidelines, I tried to add a label (I assumed that was what "A tag (usually feat, documentation, refactor etc)" meant?) but it seems like I'm not allowed to do that 🤔

🤦🏻 @mchav I just noticed other commits on main -- you meant a tag in the commit header itself, like in the conventional commits style?

@jisantuc jisantuc deleted the maint/js/downgrade-random branch February 27, 2026 05:30
@Ai-Ya-Ya
Contributor

Ai-Ya-Ya commented Feb 27, 2026

@mchav What's your timeline for the next Hackage release? nixpkgs CI should pull from there; the hope is it'll be unbroken.

@mchav
Member

mchav commented Feb 27, 2026

@Ai-Ya-Ya released: https://hackage.haskell.org/package/dataframe-0.5.0.0

@juhp

juhp commented Mar 2, 2026

This will prevent dataframe 0.5 from going into Stackage unless you relax the random bounds...
It is a bounds regression with respect to 0.4.

juhp added a commit to commercialhaskell/stackage that referenced this pull request Mar 2, 2026
@Ai-Ya-Ya
Contributor

Ai-Ya-Ya commented Mar 2, 2026

Nix won't have a problem if random 1.3 is allowed in the Cabal file, since the default version of random is the one in Stackage (1.2 for LTS 24). I agree it would be better to relax the bounds on random.

random 1.3+ plays poorly with some packages in the ecosystem

@jisantuc Is this just from Nix's diamond dependency problems from 1.3, or is this present in other package environments as well?

@jisantuc
Contributor Author

jisantuc commented Mar 2, 2026

@Ai-Ya-Ya it was just the diamond dependency problem and Nix. Because I had to pin random, other things that depended on random needed to be rebuilt; Nix runs tests in builds, and time-compat's tests pin random < 1.3. If you don't have to build time-compat's tests to build the package, then it should be fine to relax the upper bound.

Also TIL "bounds regression"

juhp added a commit to commercialhaskell/stackage that referenced this pull request Mar 3, 2026