Coming from Square (which is mostly SOA, but with an old monolithic service), we had quite a few services which really needed to be separated for performance reasons:
- One needed a large in-process cache in order to deliver good performance; it would have consumed too much memory on each instance of the monolith.
- Some services used large ML models, which also would have consumed too much memory on monolith instances.
- A lot of our payment-related services had hourly or daily batch jobs. Anything with big resource spikes probably shouldn't share a machine with latency-sensitive code (like online payment processing or just web handlers).
- Related to the above, some jobs had to be run by a single master instance. If the monolith had run them, the load would have fallen disproportionately on one instance of the monolith.