
It comes down to one central point: the product is not simply software that processes data; the product is a product of data.

Imagine writing a mobile application that stops working when the user's mood changes. How much of a headache would it be to develop, deploy, and maintain that kind of app?

Concrete example: you work on a churn problem. You're good and you have support, so you get the data fast and produce a great model. That model is perishable. The market changes, and the model you trained on the data your client gave you goes stale and loses its "predictive power". In the simplest scenario, you must get fresh data and do it all over again: training, deployment, etc.
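
A minimal sketch of that loop, assuming a scikit-learn-style classifier; load_recent_data is a hypothetical helper that returns fresh labeled churn data from the client, and the AUC threshold is an assumption:

    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import roc_auc_score

    AUC_FLOOR = 0.75  # assumed threshold below which we call the model stale

    def monitor_and_retrain(model, load_recent_data):
        # load_recent_data() is hypothetical: fresh labeled data from the client
        X, y = load_recent_data()
        auc = roc_auc_score(y, model.predict_proba(X)[:, 1])
        if auc < AUC_FLOOR:  # predictive power has decayed; retrain on fresh data
            model = LogisticRegression(max_iter=1000).fit(X, y)
        return model, auc

Even this toy version hides the real work: someone has to label the fresh data, re-validate the model, and redeploy it.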

One other difference: in normal software development, your stack is pretty much set, and you spend most of your time using it to develop, test, and ship. In ML, a lot of the effort goes into exploration. You want to try a new paper or a new algorithm, but that algorithm is implemented in only one library, and that library conflicts with another one you depend on. You want to try as many combinations as possible. This doesn't really happen in standard software development, where components change relatively slowly.
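
One common way to cope is a disposable environment per experiment. A sketch using only Python's stdlib venv module; the directory layout and package lists are illustrative:

    import subprocess
    import venv
    from pathlib import Path

    def make_experiment_env(name, packages):
        # One throwaway virtualenv per experiment, so conflicting
        # library combinations never touch each other.
        env_dir = Path("experiments") / name
        venv.create(env_dir, with_pip=True)
        pip = env_dir / "bin" / "pip"  # "Scripts/pip.exe" on Windows
        subprocess.run([str(pip), "install", *packages], check=True)
        return env_dir

    # e.g. try a paper's reference implementation in isolation:
    # make_experiment_env("new-attention-paper", ["torch==2.2.0", "einops"])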

There's also the data problem. Unless you're doing Kaggle competitions, you don't get handed tidy JSON or CSV. In most cases, you get whatever the client has: emails, PowerPoints, files, archives, audio, video, esoteric third-party systems you have to interface with without vendor support. There's no "API" to tap into, and there isn't a single source of data you can build one interface for and call it a day. Hence a lot of custom code to process it all.
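
A sketch of what that glue code looks like, using only stdlib extractors; real projects grow one branch (or one module) per client source:

    import csv
    import email
    import json
    from pathlib import Path

    def extract_text(path: Path) -> str:
        # Per-format extraction; each client adds formats to this list.
        suffix = path.suffix.lower()
        if suffix == ".json":
            return json.dumps(json.loads(path.read_text()))
        if suffix == ".csv":
            with path.open(newline="") as f:
                return "\n".join(",".join(row) for row in csv.reader(f))
        if suffix == ".eml":
            msg = email.message_from_string(path.read_text())
            return str(msg.get_payload())  # naive: ignores multipart, attachments
        raise ValueError(f"no extractor yet for {suffix}")  # the usual state of affairs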

There are many problems like these. We spend time with applicants who have only done competitions and imagine the job is building models, explaining that we're not there yet.


