The greedy approach has a (1 - 1/e) approximation guarantee. You basically prove that the problem is monotone submodular and then solve it greedily. It’s not much different from proving the problem is convex and then using gradient descent.
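As a minimal sketch of what that greedy procedure looks like: at each step, pick the item with the largest marginal gain. The objective below is maximum coverage (a standard monotone submodular function); the particular sets are a made-up toy instance for illustration.

```python
# Greedy (1 - 1/e)-approximation for monotone submodular maximization
# under a cardinality constraint. Example objective: maximum coverage.

def coverage(selected, sets):
    """Submodular objective: number of elements covered by the chosen sets."""
    covered = set()
    for i in selected:
        covered |= sets[i]
    return len(covered)

def greedy_max(sets, k):
    """Pick k sets, each time adding the one with the largest marginal gain."""
    selected = []
    for _ in range(k):
        best = max(
            (i for i in range(len(sets)) if i not in selected),
            key=lambda i: coverage(selected + [i], sets),
        )
        selected.append(best)
    return selected

# Toy instance (hypothetical data, just for illustration):
sets = [{1, 2, 3}, {3, 4}, {4, 5, 6}, {1, 6}]
picked = greedy_max(sets, k=2)  # greedy picks {1,2,3} then {4,5,6}
```

The guarantee only needs monotonicity and submodularity of the objective, which is why "prove membership in the class, then run greedy" is the whole recipe.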
There isn’t much theory on why it works in the nonconvex case. You can similarly make discrete heuristic algorithms for sorting, etc., with no guarantees.
Anyway, since nonconvex models are usually stacks of convex pieces, you can find work that incorporates these submodular functions as neural network layers.