And I believe that so strongly that I don't even consider graceful shutdown in design. Components should be able to hard-crash safely (and even frequently), and as long as a critical percentage of the system is working as intended, that shouldn't meaningfully impact the overall system.
The only way to make sure a system can handle components hard-crashing is for hard crashes to be a normal thing that happens all the time.
> Blue-Green migrations for example require a graceful exit behavior.
It may not always be necessary. For example, if you are deploying a new version of a stateless backend service behind a load balancer that forwards traffic to both current-version and new-version backends, the load balancer could be responsible for the cutover: it lets in-flight requests finish on the current-version backends while forwarding only new requests to the new backends. The old backends could then be ungracefully terminated once the LB reports that they are not processing any requests.
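To make that concrete, here is a rough Python sketch of the cutover logic, not any real load balancer's API. The `Backend` and `LoadBalancer` classes and their methods (`cut_over`, `route`, `finish`, `reap`) are names I made up for illustration, and the in-flight count is a plain counter rather than real connection tracking.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Backend:
    name: str
    in_flight: int = 0   # requests currently being processed
    alive: bool = True

    def kill(self) -> None:
        # Ungraceful termination: no shutdown hook runs, the process is just gone.
        self.alive = False


@dataclass
class LoadBalancer:
    active: list[Backend]                                  # receive new requests
    draining: list[Backend] = field(default_factory=list)  # finish old requests only
    _next: int = 0

    def cut_over(self, new_backends: list[Backend]) -> None:
        # Old backends stop receiving new requests but keep their in-flight ones.
        self.draining.extend(self.active)
        self.active = list(new_backends)
        self._next = 0

    def route(self) -> Backend:
        # Round-robin over active backends only; draining ones are never chosen.
        backend = self.active[self._next % len(self.active)]
        self._next += 1
        backend.in_flight += 1
        return backend

    def finish(self, backend: Backend) -> None:
        # Called when a response has been fully sent.
        backend.in_flight -= 1

    def reap(self) -> list[Backend]:
        # Hard-kill every draining backend that the LB knows to be idle.
        idle = [b for b in self.draining if b.in_flight == 0]
        for b in idle:
            b.kill()
        self.draining = [b for b in self.draining if b.in_flight > 0]
        return idle
```

The point is that `kill()` never needs the backend's cooperation: the LB's bookkeeping, not a shutdown hook, is what keeps requests from being dropped.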
Yeah. However, I don't need to pull the plug to shut things down, even if the software is designed to tolerate that.
On second thought, though, maybe I do. That might be the only way to ensure the assumption actually holds. Like Netflix's Chaos Monkey thing a couple of years ago.
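Something like this toy Python sketch is what I have in mind. It is only loosely in the spirit of Chaos Monkey, not how Netflix's tool works; `chaos_round`, `healthy_fraction`, `restart_all`, the replica names, and the 70% threshold are all made up for illustration.

```python
import random


def chaos_round(replicas: dict, rng: random.Random, kills: int) -> list:
    """Hard-kill up to `kills` randomly chosen live replicas and return their names."""
    live = [name for name, up in replicas.items() if up]
    victims = rng.sample(live, min(kills, len(live)))
    for name in victims:
        replicas[name] = False  # simulated plug pull: no warning, no cleanup
    return victims


def healthy_fraction(replicas: dict) -> float:
    """Fraction of replicas that are currently up."""
    return sum(replicas.values()) / len(replicas)


def restart_all(replicas: dict) -> None:
    """Stand-in for a supervisor bringing crashed replicas back."""
    for name in replicas:
        replicas[name] = True


if __name__ == "__main__":
    rng = random.Random(0)
    replicas = {f"replica-{i}": True for i in range(10)}
    for _ in range(100):
        chaos_round(replicas, rng, kills=2)
        # Hypothetical requirement: the system must survive on 70% of replicas.
        assert healthy_fraction(replicas) >= 0.7
        restart_all(replicas)
```

If the assertion only ever runs in a drill, the assumption is untested the rest of the time, which is the argument for making the kills routine.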
Relying on graceful exit and supporting it are two different things. You want to support it so you can stop serving clients without giving them nasty 5xx errors.
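A minimal Python sketch of what "supporting" it could look like, assuming the load balancer polls a readiness probe. The `Server` class and its `ready`, `begin`, `end`, and `shutdown` methods are hypothetical names, and wiring `shutdown` to SIGTERM is left out.

```python
import threading


class Server:
    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._in_flight = 0
        self._draining = threading.Event()
        self._idle = threading.Event()
        self._idle.set()  # no requests yet, so the server starts idle

    def ready(self) -> bool:
        # Readiness probe: once this turns False the LB should stop sending traffic.
        return not self._draining.is_set()

    def begin(self) -> bool:
        # Admit a request unless we are draining; the caller handles a False result.
        with self._lock:
            if self._draining.is_set():
                return False
            self._in_flight += 1
            self._idle.clear()
            return True

    def end(self) -> None:
        # Mark one admitted request as finished.
        with self._lock:
            self._in_flight -= 1
            if self._in_flight == 0:
                self._idle.set()

    def shutdown(self, timeout: float) -> bool:
        # Stop admitting work, then wait for in-flight requests to finish.
        # Returns False if the timeout expired with requests still running.
        self._draining.set()
        return self._idle.wait(timeout)
```

If `shutdown` times out, you fall back to a hard kill anyway, so the system still has to tolerate crashes; the graceful path just keeps the common case free of errors.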