> So, you start out with a system that’s very plastic but not very efficient, and that turns into a system that’s very efficient and not very plastic and flexible.
> It’s interesting that that isn’t an architecture that’s typically been used in AI.
No? That sounds exactly like training a model, then applying the trained model.
Not really. A model (after training) remains as plastic as it was before. Correspondingly, it can be made to "forget" what it has "learned" and learn something else. [At least architecture-wise. Whether the learned parameter values themselves reduce plasticity is a non-trivial claim that would need to be demonstrated.]
If contemporary AI/ML models behaved the way Alison Gopnik describes, their tendency to overfit would be even more of a problem -- you couldn't even transfer them from a simulation domain to reality, since they would lose all their plasticity overfitting to the simulation!
Also, this article contains lots of other interesting ideas to think about. Highly recommend reading all of it.
Only because you have chosen to decrease the learning rate over time so that the model converges. You could reset it.
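A minimal sketch of this point (plain Python, no particular framework assumed): the learning rate is a choice of the training procedure, and a decayed schedule can simply be reset to restore the original step size.

```python
# Sketch: an exponential decay schedule drives step sizes toward zero,
# making the trained model behave "rigidly" -- but the schedule itself
# can be reset at any time. The schedule is a training-time choice,
# not an architectural property.

def exponential_lr(base_lr, decay, step):
    """Learning rate after `step` updates with multiplicative decay."""
    return base_lr * (decay ** step)

base_lr, decay = 0.1, 0.9

# Late in training, updates have become tiny...
late_lr = exponential_lr(base_lr, decay, step=50)

# ...but resetting the schedule restores the original step size.
reset_lr = exponential_lr(base_lr, decay, step=0)

print(late_lr)   # well below 1e-3
print(reset_lr)  # back to 0.1
```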
I would even argue the LR is not relevant to plasticity: it's a meta-variable of the training procedure, not a property of the model itself.
A plastic model would be one that could accomplish a given task after being trained on it, then be trained on a second task while maintaining the ability to accomplish the first.
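A toy illustration of the failure mode (pure Python, my own hypothetical setup, not from the article): a single-parameter model is trained by SGD on task A, then on task B, after which its performance on task A collapses. A plastic learner, in the sense above, would avoid this.

```python
# Catastrophic forgetting in miniature: one weight w, squared-error loss.
# Task A is y = 2x; task B is y = -x. Training on B overwrites what
# was learned for A.

def sgd(w, data, lr=0.1, epochs=100):
    """Plain SGD on squared error for a model y_hat = w * x."""
    for _ in range(epochs):
        for x, y in data:
            grad = 2 * (w * x - y) * x   # d/dw of (w*x - y)^2
            w -= lr * grad
    return w

def loss(w, data):
    return sum((w * x - y) ** 2 for x, y in data) / len(data)

task_a = [(1.0, 2.0), (2.0, 4.0)]    # y = 2x
task_b = [(1.0, -1.0), (2.0, -2.0)]  # y = -x

w = sgd(0.0, task_a)
loss_a_before = loss(w, task_a)   # near zero: task A learned

w = sgd(w, task_b)
loss_a_after = loss(w, task_a)    # large: task A forgotten
```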
Elman proposed (and I think built) a model in the mid 1990s (see his book: Rethinking Innateness) that works in exactly this manner: a "wave of growth" moves across an initially highly connected "cortical" network; parts learn (parcel out) and then become fixed as other, nearby parts learn. You end up with something like a stack of deep-learned transducers, with higher-order concepts built on top of lower-order ones.
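A hedged sketch of that "wave of growth" idea (my own simplification, not Elman's actual model): each layer in a stack gets a nonzero learning-rate multiplier only while the wave covers it, then freezes, so later layers learn on top of earlier, now-fixed representations.

```python
# One wave sweeping over a stack of layers: layer i is plastic only
# during its window of epochs, then its weights are frozen (multiplier
# drops to zero), mimicking plasticity that is lost as regions fixate.

def plasticity(layer, epoch, window=10):
    """LR multiplier: 1.0 while the wave covers `layer`, else 0.0."""
    return 1.0 if window * layer <= epoch < window * (layer + 1) else 0.0

# At epoch 15 with window=10, only layer 1 of a 3-layer stack is plastic:
print([plasticity(l, 15) for l in range(3)])  # [0.0, 1.0, 0.0]
```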