It seems to me the biggest challenge for software engineers transitioning into ML is that ML is a very broad field, and people conflate a lot of roles. It spans everything from purely research-oriented questions (backbones, optimizers, initializers, etc.) to more 'MLOps'-style pipelining questions, which tend to fall into the classical engineering / DevOps bucket. So the real question is: what type of ML do you want to do?
If you're looking to land a job at FAIR, DeepMind, Google Brain, or Nvidia Research as a researcher or ML scientist, the expected knowledge is very different from 'data science'. These are research labs that work on pushing the state of the art forward, supported by great engineers building tools that improve ML research. Transitioning into this sort of role requires more than doing Kaggle competitions: it requires developing an intuition for the respective ML subfield, trying new things, and usually failing. In other words, it is a research role and will require a lot of study and learning.
If, on the other hand, you're looking at data science work (taking a model and building a pipeline to run it, performing hyperparameter sweeps, or simply modifying some model code), then I would say that is much more engineering than research ML. It has a much lower barrier to entry coming from engineering and could be a good stepping stone toward pure ML research.
A more general point to consider when thinking about transitioning to ML: these systems are probabilistic in nature, rather than purely deterministic as most software systems are. People are bad at wrapping their heads around distributional processes; you can see this in every field that deals with them (quantum vs. classical physics, biological systems, etc.).
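To make the probabilistic point concrete, here's a toy sketch (not a real training loop; the noisy gradient just stands in for minibatch sampling): the exact same code, run with different seeds, produces a distribution of outcomes rather than one answer.

```python
import random

def train_once(seed, steps=1000, lr=0.1):
    """Toy 1-D 'training' run: noisy gradient descent toward w = 2.0.
    Purely illustrative; the Gaussian noise mimics minibatch sampling."""
    rng = random.Random(seed)
    w = 0.0
    for _ in range(steps):
        grad = (w - 2.0) + rng.gauss(0, 1.0)  # true gradient plus noise
        w -= lr * grad
    return w

# Identical code, different seeds: the result is a distribution, not a number.
results = [round(train_once(seed), 3) for seed in range(5)]
print(results)  # five similar-but-different values near 2.0
```

Debugging and testing have to change accordingly: you reason about the spread of outcomes, not a single expected value.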
In general, what I've seen is that when engineers dip their toes into ML, what's required is a mindset shift in how they approach problems. Once that happens, the depth of that shift determines the type of ML role you'll want to pursue.
It's interesting that you mention this, because there's quite an impressive resurgence of privately funded R&D going on in the ML space. We're in an interesting phase, where the field is moving too fast to have 'canonical' methodologies (though we're getting close). To be an engineer in the deep learning space often requires reading and keeping up with research. Everything just gets dumped on arXiv, because the peer-reviewed publication cycle is almost too slow for the field.
Making a successful transition to ML, in my opinion, depends a lot on the individual. Without a strong background in calculus, linear algebra, and statistics, it's going to be difficult. Training a model is what people tend to focus on, but in my opinion that's the easy part. The harder parts are evaluating and validating a model, analyzing and preparing your data, anticipating model performance, understanding what to do to improve your fit, and choosing a model or architecture. Developing custom deep learning architectures at times requires an abstract mathematical intuition that I think will suit many engineers very well. A lot of engineers are well equipped to make the transition successfully, but at least as many aren't.
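A minimal sketch of why evaluation matters more than training (toy data and toy "models", not any particular library's API): a model that memorizes the training set looks perfect on training metrics and falls apart on held-out data.

```python
import random

# Synthetic data: y = 2x plus noise, split into train/validation.
rng = random.Random(0)
data = [(x, 2 * x + rng.gauss(0, 0.5)) for x in [i / 10 for i in range(100)]]
rng.shuffle(data)
train, val = data[:80], data[80:]

def mse(model, split):
    """Mean squared error of a model over a data split."""
    return sum((model(x) - y) ** 2 for x, y in split) / len(split)

# Model A: memorize the training set (1-nearest neighbor on x).
def memorizer(x):
    return min(train, key=lambda p: abs(p[0] - x))[1]

# Model B: the underlying linear relationship, hard-coded for brevity.
def linear(x):
    return 2 * x

print(f"memorizer  train={mse(memorizer, train):.3f}  val={mse(memorizer, val):.3f}")
print(f"linear     train={mse(linear, train):.3f}  val={mse(linear, val):.3f}")
# The memorizer scores a perfect 0.0 on training data but not on validation:
# training metrics alone say nothing about generalization.
```

This is the habit that separates "can call model.fit" from "can ship a model": always judging a fit on data the model never saw.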
In the future, I think the field will have many varying degrees of expertise, with the barriers to entry becoming lower all the time. We're reaching a point where some common use cases can be solved adequately in a nearly automated fashion. Some "autoML" tools don't require any real understanding of ML, though I think it's unwise to get in the habit of using them without understanding how to evaluate a fit. These tools will be great for people who occasionally want to use ML to solve smaller problems as part of a larger job function.
In the middle, ML engineers and practitioners will be training and operationalizing models and keeping up with major developments in research. But there will be some significant changes in the next decade. I predict the nebulous mix of data science, data analysis, and machine learning will become formalized into three major skills: exploratory data analysis, machine learning, and advanced computational statistics.
At the lowest level, researchers will continue developing the field, which, like you say, is probably not something you transition into directly.