Anecdotally, I had a fantastic experience with `clojure.spec.alpha` (with or wit...

kragen · on July 5, 2024

it doesn't sound like hypothesis that unable to handle large data sets in this case, though that is indeed not its forte; it sounds like you were rejecting a large proportion of the shrunk instances, so hypothesis would try shrinking by setting a generated integer to zero, in order to see if the bug still existed for zero, and your test would reject it because it had a zero in it. not fail, but reject. for small instances this was just inefficient, but for larger ones it got to the point that hypothesis gave up

someone in that thread suggested that you use a different instance generation strategy that can't generate zeroes, instead of rejecting the instances hypothesis's shrinker most loves to generate, once they are generated. did you try that?
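
to make the difference concrete, here is a minimal stdlib-only sketch. it is not hypothesis's api and not the test from the issue; `gen_by_rejection` and `gen_by_construction` are hypothetical names, and the range 0 to 9 is an arbitrary assumption. in hypothesis itself the analogous contrast would be roughly `integers().filter(lambda x: x != 0)` versus `integers(min_value=1)`

```python
import random


def gen_by_rejection(rng, size, max_tries=1000):
    """Draw lists of ints in [0, 9] and throw away any list containing a zero.

    Returns (list or None, number of tries used). This models "generate, then
    reject", which is what filtering after generation amounts to.
    """
    for tries in range(1, max_tries + 1):
        xs = [rng.randint(0, 9) for _ in range(size)]
        if 0 not in xs:
            return xs, tries
    return None, max_tries  # gave up: every draw was rejected


def gen_by_construction(rng, size):
    """Draw every element from [1, 9], so a zero can never be produced."""
    return [rng.randint(1, 9) for _ in range(size)]


if __name__ == "__main__":
    rng = random.Random(0)
    for size in (5, 50, 500):
        _, tries = gen_by_rejection(rng, size)
        print(size, "rejection tries:", tries)
        print(size, "constructed ok:", 0 not in gen_by_construction(rng, size))
```

the constructive version never wastes a draw, whatever the size; the rejecting version needs more and more tries as the list grows and eventually gives up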

how does clojure.spec.alpha handle this differently?

in mjaniczek's comment at https://news.ycombinator.com/item?id=40876437, they call out your case as a disadvantage of hypothesis's approach:

> The disadvantage is that the generators are now parsers from the lists of bytes that can fail (introducing some inefficiency) and user can make a crazy generator that the internal shrinker will not be able to shrink perfectly. Nevertheless, it's the best DX of the three approaches, ...

though presumably you wouldn't agree that your test was written in a 'crazy' way

kragen · on July 5, 2024

> hypothesis that unable

this should read "hypothesis was unable". we regret the error. those responsible have been sacked

epgui · on July 5, 2024

If hypothesis can’t shrink a result further, shouldn’t I expect it to just return the failing result?

There’s almost always a point where further shrinking is not possible.

kragen · on July 5, 2024

yes, but if it's too hard to generate an instance your test will accept as a valid instance (not passing, valid), i'd instead expect it to tell you to fix your test, so you can get properly shrunk results, and that's what it did. did you try the different instance generation strategy suggested in the thread? how does clojure.spec.alpha handle this differently?

epgui · on July 6, 2024

On the surface, clojure.spec.alpha seems to handle things more or less similarly (you get random results skewing heavily towards small/trivial/edge values), but the main difference is that it can be used and abused without problem. As long as you can write predicates and compose specs, it handles it for you and “just works”.

kragen · on July 7, 2024

did you try the different instance generation strategy suggested in the thread?

you make it sound like clojure.spec.alpha doesn't do shrinking at all. possibly you don't know what shrinking is? shrinking keeps your results from looking "random" by simplifying them until all the simpler versions of the test case pass, so it finds the simplest failing case instead of a "random" one. this also enables hypothesis to be a lot more aggressive at looking for failing cases, for example by generating much larger test data sets than you would want to pore over, because if it finds one that fails, it will shrink it before showing it to you
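
a toy sketch of the idea, assuming test cases are lists of ints. this is a greedy illustration only, not how hypothesis implements shrinking; `shrink`, `candidates`, `fails`, and `is_valid` are hypothetical names. the `is_valid` hook models a test that rejects some instances (rather than passing or failing them), and the counter shows how many shrink attempts get wasted that way

```python
def candidates(case):
    """Yield simpler variants of a list of ints."""
    # Try dropping one element at a time.
    for i in range(len(case)):
        yield case[:i] + case[i + 1:]
    # Try replacing one element with something simpler: zero first, then half.
    for i, x in enumerate(case):
        for simpler in (0, x // 2):
            if simpler != x:
                yield case[:i] + [simpler] + case[i + 1:]


def shrink(case, fails, is_valid=lambda c: True):
    """Greedily simplify a failing case.

    Returns (simplest failing case found, number of candidates rejected as
    invalid). Stops when no simpler valid candidate still fails.
    """
    rejected = 0
    improved = True
    while improved:
        improved = False
        for cand in candidates(case):
            if not is_valid(cand):
                rejected += 1  # rejected, not passed or failed
                continue
            if fails(cand):
                case = cand  # keep the simpler failing case and start over
                improved = True
                break
    return case, rejected


if __name__ == "__main__":
    fails = lambda xs: sum(xs) >= 100  # stand-in for "the bug triggers"
    start = [37, 12, 81, 5, 64]
    print(shrink(start, fails))
    print(shrink(start, fails, is_valid=lambda xs: 0 not in xs))
```

the second call rejects every candidate that has a zero in it, which is exactly the kind of candidate a shrinker proposes first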

if i've misunderstood you and clojure.spec.alpha does do shrinking, what does it do in the case where your test case is written to reject all but an exponentially small fraction of shrunk cases? does it just take an exponentially long time to run your test suite? is that what you mean by 'just works'?

kragen · on July 19, 2024

i guess you didn't try the different instance generation strategy suggested in the thread, since i asked three times and you didn't answer three times

ilikehurdles · on July 5, 2024

Yep. I loved clojure’s spec as it was really easy to build around and then I moved to elixir and found that I have to… drop down to an old erlang library (propEr) to write tests like that. Pretty disappointing.

rtpg · on July 5, 2024

the example posted in that github issue is using filter in a way that is causing its own problems.

If you generate stuff randomly, then filter out stuff matching some property, you're basically playing the lottery while generating!

epgui · on July 5, 2024

Are you sure things are executed sequentially in that order? I would expect hypothesis strategies composed together to be lazily evaluated.

In any case, even in the case you describe, I would not expect the strategy to fail with probability 1. Remember that this generates multiple samples, and only one of these needs to be valid.
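
A back-of-envelope sketch of that probability, under a simplifying assumption that is not taken from the issue: each element is independently "bad" (for example, zero) with probability `p_bad`, and an instance is valid only if none of its elements is bad. The function names and the numbers used are hypothetical, and Hypothesis's real distribution is not uniform.

```python
import math


def p_valid(p_bad, size):
    """Probability that one instance of `size` elements has no bad element."""
    return (1.0 - p_bad) ** size


def p_any_valid(p_bad, size, attempts):
    """Probability that at least one of `attempts` independent instances is valid."""
    q = p_valid(p_bad, size)
    if q >= 1.0:
        return 1.0
    # 1 - (1 - q) ** attempts, computed in a way that stays accurate for tiny q.
    return -math.expm1(attempts * math.log1p(-q))


if __name__ == "__main__":
    for size in (1, 10, 100, 500):
        print(size, p_valid(0.1, size), p_any_valid(0.1, size, 1000))
```

Under this model the failure probability is never exactly 1, but the chance of a valid instance falls off exponentially with size, so a fixed number of attempts stops being enough fairly quickly.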