
If a distributed system relies on clients gracefully exiting to work, the system will eventually break badly.


And I believe that so strongly that I don't even consider graceful shutdown in design. Components should be able to safely (and even frequently) hard-crash, and so long as a critical percentage of the system is working as intended (WAI), it shouldn't meaningfully impact the overall system.

The only way to make sure a system can handle components hard-crashing is for hard crashes to be a normal thing that happens all the time.

All glory to the chaos monkey!
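To make the "hard crashes are normal" idea concrete, here is a minimal sketch, not anyone's production design. It assumes a toy at-least-once queue where a task only leaves the system once it is acknowledged, and a supervisor that requeues work held by dead workers. The names `CrashSafeQueue`, `reap`, and `run_with_crashes` are hypothetical, and the "crash" is simulated with a seeded random draw.

```python
import random
from collections import deque


class CrashSafeQueue:
    """Toy at-least-once queue: a task is only gone once it has been acked."""

    def __init__(self, tasks):
        self.pending = deque(tasks)
        self.in_flight = {}  # worker_id -> task currently leased to that worker

    def lease(self, worker_id):
        task = self.pending.popleft()
        self.in_flight[worker_id] = task
        return task

    def ack(self, worker_id):
        del self.in_flight[worker_id]

    def reap(self, worker_id):
        """Supervisor hook (assumed): requeue whatever a dead worker was holding."""
        task = self.in_flight.pop(worker_id, None)
        if task is not None:
            self.pending.append(task)


def run_with_crashes(tasks, crash_probability, seed=0):
    """Process every task even though workers die without running any cleanup."""
    rng = random.Random(seed)
    queue = CrashSafeQueue(tasks)
    done = set()
    crashes = 0
    worker_id = 0
    while queue.pending:
        worker_id += 1
        queue_task = queue.lease(worker_id)
        if rng.random() < crash_probability:
            # Simulated hard crash: the worker sends no ack and runs no handler.
            crashes += 1
            queue.reap(worker_id)
            continue
        done.add(queue_task)  # idempotent effect, so a redelivery is harmless
        queue.ack(worker_id)
    return done, crashes
```

Calling `run_with_crashes(range(20), crash_probability=0.5)` still completes all 20 tasks, because correctness never depends on a worker exiting cleanly.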


There's a big gap between supporting graceful shutdown to be nice to clients / workflows and clients relying on it to work.


Way back when, in physical land - I used STONITH for that! https://smcleod.net/2015/07/delayed-serial-stonith/


There are valid reasons to want the typical exit not to look like a catastrophic one, even if that's a recoverable situation.

Whether my application went down from SIGINT or from a kill makes a big difference.

Blue-Green migrations for example require a graceful exit behavior.
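The distinction between a catchable signal and a hard kill can be sketched as follows. This is a minimal illustration, assuming a single-process service with a polling main loop. The names `request_shutdown`, `install_handlers`, and `run_until_shutdown` are hypothetical, as are the `handle_one` and `drain` callbacks.

```python
import signal
import threading

# Set once a catchable shutdown signal has arrived.
shutdown_requested = threading.Event()


def request_shutdown(signum, frame):
    """Signal handler: only record the request; the main loop does the cleanup."""
    shutdown_requested.set()


def install_handlers():
    """Register handlers for the signals a process is allowed to catch.

    SIGKILL cannot be caught or ignored, so a hard kill never reaches this
    code. That is why the two exits look different from the outside.
    Must be called from the main thread.
    """
    signal.signal(signal.SIGINT, request_shutdown)
    signal.signal(signal.SIGTERM, request_shutdown)


def run_until_shutdown(handle_one, drain):
    """Call handle_one() repeatedly until shutdown is requested, then drain()."""
    while not shutdown_requested.is_set():
        handle_one()
    # Reached only on a graceful exit: flush buffers, deregister, and so on.
    drain()
```

A service would call `install_handlers()` once at startup and then `run_until_shutdown(...)`. After a SIGKILL, `drain()` simply never runs, so anything that must survive has to be handled by the crash-recovery path instead.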


> Blue-Green migrations for example require a graceful exit behavior.

It may not always be necessary. E.g. if you are deploying a new version of a stateless backend service, and there is a load balancer forwarding traffic to both current-version and new-version backends, the load balancer could be responsible for cutting over: it lets in-flight requests finish on the current-version backends while forwarding new requests only to the new backends. Then the old backends could be ungracefully terminated once the LB says they are not processing any requests.
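The cutover described above can be sketched roughly like this. It is a toy in-memory model, not a real load balancer API: `Backend`, `LoadBalancer`, `drain`, and `safe_to_kill` are all hypothetical names, and request tracking is reduced to a counter.

```python
class Backend:
    def __init__(self, name):
        self.name = name
        self.in_flight = 0     # requests currently being processed
        self.draining = False  # True once it should receive no new requests


class LoadBalancer:
    def __init__(self, backends):
        self.backends = list(backends)

    def add(self, backend):
        self.backends.append(backend)

    def route(self):
        """Send a new request to the least loaded backend that is not draining."""
        candidates = [b for b in self.backends if not b.draining]
        if not candidates:
            raise RuntimeError("no backend available for new requests")
        backend = min(candidates, key=lambda b: b.in_flight)
        backend.in_flight += 1
        return backend

    def finish(self, backend):
        """Record that one request on this backend has completed."""
        backend.in_flight -= 1

    def drain(self, backend):
        """Stop sending new requests; in-flight ones are allowed to finish."""
        backend.draining = True

    def safe_to_kill(self, backend):
        """True once a drained backend has nothing left in flight."""
        return backend.draining and backend.in_flight == 0
```

In this model the old backend needs no shutdown logic of its own: once `safe_to_kill` returns `True`, terminating it abruptly cannot drop a request, because the load balancer holds all the state about what is in flight.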


Yeah. However, I do not need to pull the plug to shut things down even if the software was designed to tolerate that.

On second thought, though, maybe I do. That might be the only way to ensure the assumption is true. Like the Netflix's chaos monkey thing a couple years ago.


> Like the Netflix's chaos monkey thing a couple years ago.

That was released 15 years ago.


Thanks for reminding me how old I am.


Relying on graceful exit and supporting it are two different things. You want to support it so you can stop serving clients without giving them nasty 5xx errors.


No one said that.



