Crucially it will tend to find the simplest such representation that still solve...

skyde · on June 1, 2024

What do you mean by simplest in terms of optimization?

I get that it finds solutions that are easy for the SGD or Adam optimizer to find.

But why would such a solution be less simple than the others?

Reubend · on June 2, 2024

(I could be wrong here. Please correct me if that's the case.)

I think the comment you're replying to means exactly what you're saying: it will find solutions that are "easy" for the optimizer to find, i.e. solutions that the optimizer readily converges to.
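One rough way to make "easy for the optimizer" concrete is a toy problem with more unknowns than equations, where many solutions fit equally well. The sketch below is only an illustration under my own assumptions (the problem, and the names `gradient_descent`, `a`, and `b`, are made up for this example, and a single linear equation is nothing like a real network). Plain gradient descent started from zero ends up at the smallest-norm solution, even though nothing in the loss asks for that.

```python
def gradient_descent(a, b, lr=0.1, steps=500):
    """Minimize 0.5 * (a . w - b)^2 with plain gradient descent from w = 0."""
    w = [0.0] * len(a)
    for _ in range(steps):
        # Residual of the single linear equation a . w = b.
        residual = sum(ai * wi for ai, wi in zip(a, w)) - b
        # Gradient of the loss with respect to w is residual * a.
        w = [wi - lr * residual * ai for ai, wi in zip(a, w)]
    return w


# One equation, two unknowns: w1 + 2*w2 = 5 has infinitely many solutions,
# e.g. (5, 0), (1, 2), (-1, 3).
a, b = [1.0, 2.0], 5.0
w = gradient_descent(a, b)

# Closed-form minimum-norm solution for a single equation: a * b / (a . a).
min_norm = [ai * b / sum(x * x for x in a) for ai in a]

print("gradient descent:", w)
print("minimum norm:    ", min_norm)
```

Every update is a multiple of `a`, so starting from zero the iterate never leaves that direction, and the only exact solution on it is the minimum-norm one. Whether that notion of "simple" carries over to SGD or Adam on deep networks is a separate question that this toy case does not settle.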