Ask HN: Do you know what your ML systems are doing?
3 points by ThePhysicist on Aug 12, 2019 | 1 comment
Dear ML developers, data scientists and ML DevOps: How do you track your ML systems in production and which metrics do you monitor? How do you make sure your models do what they're supposed to do when confronted with new data? Do you worry about security and robustness of your models? Can you debug problems in your ML pipeline as effectively as in your software pipeline?


Great question!

Monitoring models in production is actually quite tricky, especially when the ground truth label is either unavailable or arrives with a long delay (for example, if you are predicting the sales forecast for next quarter, there will be a three-month delay between your prediction and the ground truth label).

Monitoring:

What I have found to work is to track data distributions instead. You can then compare the training distribution to the live distribution in real time using the Wasserstein distance [1].
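
For a single numeric feature this is only a few lines with scipy. A rough sketch (the array names and the 0.3 threshold are placeholders, not from any real system; you'd calibrate the threshold on held-out training batches):

  # Compare a feature's training distribution to a window of recent
  # production values using the 1-D Wasserstein distance.
  import numpy as np
  from scipy.stats import wasserstein_distance

  def drift_score(train_values, live_values):
      """Wasserstein distance between training and live samples of one feature."""
      return wasserstein_distance(train_values, live_values)

  train = np.random.normal(0, 1, 10_000)   # stand-in for a training feature
  live = np.random.normal(0.5, 1, 1_000)   # stand-in for recent production data
  if drift_score(train, live) > 0.3:       # threshold is an assumption
      print("possible drift on this feature")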

I've also heard of people training auto-encoders on the training data and then tracking the reconstruction error in production. If the data changes substantially, the reconstruction error should increase.
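
To make the idea concrete, here's a minimal sketch using scikit-learn's MLPRegressor trained to reconstruct its own input. A real setup would likely use a proper auto-encoder (e.g. in PyTorch); the shapes, data, and the factor-of-2 alert threshold are all illustrative assumptions:

  import numpy as np
  from sklearn.neural_network import MLPRegressor

  X_train = np.random.normal(0, 1, (5_000, 20))   # stand-in for training features

  # The narrow hidden layer acts as a bottleneck, forcing a compressed
  # representation of the training distribution.
  ae = MLPRegressor(hidden_layer_sizes=(5,), max_iter=500)
  ae.fit(X_train, X_train)

  def reconstruction_error(X):
      return float(np.mean((ae.predict(X) - X) ** 2))

  baseline = reconstruction_error(X_train)
  X_live = np.random.normal(0.5, 1.2, (500, 20))  # stand-in for production data
  if reconstruction_error(X_live) > 2 * baseline: # factor of 2 is an assumption
      print("input distribution may have shifted")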

Debugging:

Debugging ML pipelines can be difficult, but what I have found to work is to log all input and output features (with a correlation id so that you can link each prediction to the other systems). A dashboard where you can enter a correlation id and see, for that request, the value of each feature overlaid on top of that feature's distribution in the training set is very valuable.
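
A sketch of what the logging side might look like, assuming JSON lines shipped to whatever log store backs the dashboard (the model here is a hypothetical sklearn-style object; reuse an upstream request id as the correlation id if you have one):

  import json
  import logging
  import uuid

  logging.basicConfig(level=logging.INFO, format="%(message)s")
  logger = logging.getLogger("ml.predictions")

  def predict_and_log(model, features):
      correlation_id = str(uuid.uuid4())  # or reuse the upstream request id
      prediction = model.predict([list(features.values())])[0]
      logger.info(json.dumps({
          "correlation_id": correlation_id,
          "features": features,           # every input feature, by name
          "prediction": float(prediction),
      }))
      return prediction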

Hopefully this provides at least some answers! It's a very broad topic though, so I could keep talking about it for hours.

When deploying ML models in the past, I've encountered exactly the issues you are talking about and didn't find good monitoring solutions, so I created Stakion [2] to do just this! It tracks ML models in real time and takes just a few lines of code to integrate, no matter how you deploy your models.

[1] https://towardsdatascience.com/life-of-a-model-after-deploym...

[2] https://stakion.io



