Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

I am familiar with an algorithm that stably brings a disjoint selection of items together around a specified point. Sounds similar to this case, where the disjoint selection are changes that happened to a given file.

The name of the algorithm is “gather”, by Sean Parent and Marshall Clow.



https://github.com/stlab/adobe_source_libraries/blob/7659244...

https://listarchives.boost.org/Archives/boost/2013/01/200366...

I gotta say, I don't see the greatness any more than most of the repliers in that Boost thread — it's just two stable_partitions in a row.

"[...] Or is there some optimization that gather provides over (stable_)partition? —— Nope. [...]"


The Boost thread starts with an example of how Bjarne replaced a bunch of complicated code with it.

It may be just two stable partitions, but “just” is doing a lot of work there. The algorithm becomes obvious once someone has identified it.


The talk: https://www.youtube.com/watch?v=OB-bdWKwXsU&t=52m49s

Sadly the 25-line original code isn't presented; the code that is presented is the 5-line replacement using the STL's `find_if` and `rotate`. Bjarne sketches the idea that those five lines can be further condensed into two lines with the non-STL `gather` algorithm:

    auto dest = std::find_if(v.begin(), v.end(), contains(p));
    stdx::gather(v.begin(), dest, v.end(), [](const auto& elt) { return &elt == &*source; });
But this is overkill — replacing an O(distance(source,dest)) non-allocating rotate with an O(v.size()) potentially-allocating stable_partition — and more importantly it re-complicates the code.

Now, I think part of his point is that `stable_partition` is "simpler" than `gather` only because it's in the STL. If we add `gather` to the STL too and everyone learns what it means, then there's no objection to using `gather` for "simplification" like this: it would be a straightforward simplification in almost the same way that `std::equal_range(first, last, x)` is a straightforward simplification of `std::make_pair(std::lower_bound(first, last, x), std::upper_bound(first, last, x))`.

The "almost" is that actually there is an algorithmic advantage to `std::equal_range`: when you're looking for the upper bound, you don't have to consider any of the elements to the left of the lower bound you already found. You get a (very slight) performance boost by using the combined `equal_range` algorithm. `gather`, on the other hand, has no such advantage; and (as we've seen) has a (very slight) performance disadvantage when compared to the `rotate` that Bjarne's correspondent's code actually required.

We're not talking about replacing 25 lines of bespoke code with 1 line of Boost `gather`; we're talking about replacing 2 lines of STL `stable_partition` with 1 line of Boost `gather`. The former is probably worth it. The latter is not.


"muster" comes to mind and is different than "gather".




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: