To me, writing in full, formally correct sentences, being careful to always use correct punctuation, starts to feel a little pretentious or tryhard in some contexts.
It doesn't feel too much like that here on HN. But on reddit, I use less formal structure most of the time, and that feels natural to me.
> If your database goes down at 3 AM, you need to fix it.
Of all the places I've worked that had the attitude "If this goes down at 3AM, we need to fix it immediately", there was only one where that was actually justifiable from a business perspective. I've worked at plenty of places that had this attitude despite the fact that overnight traffic was minimal and nothing bad actually happened if a few clients had to wait until business hours for a fix.
I wonder if some of the preference for big-name cloud infrastructure comes from the fact that during an outage, employees can just say "AWS (or whatever) is having an outage, there's nothing we can do" vs. being expected to actually fix it.
From this perspective, the ability to fix problems more quickly when self-hosting could be considered an antifeature for the employee getting woken up at 3am.
No. You sit on the call and wait to restore your service to your users. There’s bullshit toil in disabling scale in as the outage gets longer.
Eventually, AWS has a VP of something dial in to your call to apologize. They’re unprepared and offer no new information. They get handed to a side call for executive bullshit.
AWS comes back. Your support rep only vaguely knows what’s going on. Your system serves some errors but digs out.
Really? That might be an anecdote sampled from unusually small businesses, then. Between myself and most peers I’ve ever talked to about availability, I heard an overwhelming majority of folks describe systems that really did need to be up 24/7 with high availability, and thus needed fast 24/7 incident response.
That includes big and small businesses, SaaS and non-SaaS, high scale (5M+rps) to tiny scale (100s-10krps), and all sorts of different markets and user bases. Even at the companies that were not staffed or providing a user service over night, overnight outages were immediately noticed because on average, more than one external integration/backfill/migration job was running at any time. Sure, “overnight on call” at small places like that was more “reports are hardcoded to email Bob if they hit an exception, and integration customers either know Bob’s phone number or how to ask their operations contact to call Bob”, but those are still environments where off-hours uptime and fast resolution of incidents was expected.
Between me, my colleagues, and friends/peers whose stories I know, that’s an N of high dozens to low hundreds.
IME the need for 24x7 for B2B apps is largely driven by global customer scope. If you have customers in North America and Asia, now you need 24x7 (and x365 because of little holiday overlap).
That being said, there are a number of B2B apps/industries where global scope is not a thing. For example, many providers who operate in the $4.9 trillion US healthcare market do not have any international users. Similarly the $1.5 trillion (revenue) US real estate market. There are states where one could operate where healthcare spending is over $100B annually. Banks. Securities markets. Lots of things do not have 24x7 business requirements.
I’ve worked for banks, multiple large and small US healthcare-related companies, and businesses that didn’t use their software when they were closed for the night.
All of those places needed their backend systems to be up 24/7. The banks ran reports and cleared funds with nightly batches—hundreds of jobs a night for even small banking networks. The healthcare companies needed to receive claims and process patient updates (e.g. your provider’s EMR is updated if you die or have an emergency visit with another provider you authorized for records sharing—and no, this is not handled by SaaS EMRs in many cases) over night so that their systems were up to date when they next opened for business. The “regular” businesses closed for the night generated reports and frequently had IT staff doing migrations, or senior staff working on something at midnight due the next day (when the head of marketing is burning the midnight oil on that presentation, you don’t want to be the person explaining that she can’t do it because the file server hosting the assets is down all the time after hours).
And again, that’s the norm I’ve heard described from nearly everyone in software/IT that I know: most businesses expect (and are willing to pay for or at least insist on) 24/7 uptime for their computer systems. That seems true across the board: for big/small/open/closed-off-hours/international/single-timezone businesses alike.
You are right that a lot of systems at a lot of places need 24x7. Obviously.
But there are also a not-insignificant number of important systems where nobody is on a pager, where there is no call rotation[1]. Computers are much more reliable than they were even 20 years ago. It is an Acceptable Business Choice to not have 24x7 monitoring for some subset of systems.
Until very recently[2], Citibank took their public website/user portal offline for hours a week.
1 - if a system does not have a fully staffed call rotation with escalations, it's not prepared for a real off-hours uptime challenge
2 - they may still do this, but I don't have a way to verify right now.
This lasts right up until an important customer can't access your services. Executives don't care about downtime until they have it, then they suddenly care a lot.
You can often have services available for VIPs, and be down for the public.
Unless there's a misconfiguration, usually apps are always visible internally to staff, so there's an existing methodology to follow to make them visible to VIPs.
But sometimes none of that is necessary. At a 1B market cap company, I've seen a failure case where the solution was manual execution by customer success reps while the computers were down. It was slower, but not many people complained that their reports took 10 minutes to arrive after being parsed by Eye Ball Mk 1s, instead of the 1 minute of wait time they were used to.
Uptime is also a sales and marketing point, regardless of real-world usage. Business folks in service-providing companies will usually expect high availability by default, only tempered by the cost and reality of more nines.
Also, in addition to perception/reputation issues, B2B contracts typically include an SLA, and nobody wants to be in breach of contract.
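For a rough sense of what "more nines" means in an SLA, here is a minimal sketch that converts an availability target into allowed downtime per year. The function name and the list of targets are illustrative, not from any particular contract, and it assumes a 365-day year.

```python
# Allowed downtime per year for a given availability target.
# Assumption: a 365-day year (8760 hours), ignoring leap years.
HOURS_PER_YEAR = 365 * 24


def allowed_downtime_hours(availability_pct: float) -> float:
    """Hours per year a service may be down and still meet the target."""
    return HOURS_PER_YEAR * (1 - availability_pct / 100)


# Illustrative targets: two to five nines.
for pct in (99.0, 99.9, 99.99, 99.999):
    hours = allowed_downtime_hours(pct)
    print(f"{pct}% -> {hours:.2f} h/year ({hours * 60:.1f} min/year)")
```

Each extra nine cuts the yearly downtime budget by a factor of ten, which is why the cost climbs so quickly.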
I think the parent you're replying to is wrong, because I've worked at small companies selling into large enterprise, and the expectation is basically 24/7 service availability, regardless of industry.
I would say something like 95% of the code I have been paid to write as a software engineer has 0% test coverage. Like, literally, not a single test on the entire project. Across many different companies and several countries, frontend and backend.
I wonder if I'm an anomaly, or if it's actually more common than one might assume?
Once you realize that automated testing gives you a level of confidence throughout iteration that you can't replicate through manual interaction (nor would you want to), you never go back.
It's just a matter of economics. Where the cost of bugs in production is low, which is probably the vast majority of the software out there, extensive test coverage simply doesn't make economic sense. Something breaks in some niche app, maybe someone is bothered enough to complain about it, it gets fixed at some point, and everybody moves on.
Where the costs are high, like say in safety-critical software or large companies with highly paid engineers on-call where 9s of uptime matter, the amount of testing and development rigor naturally scales up.
This is why rigid stances like that from "Uncle Bob" are shortsighted: they have no awareness of the actual economics of things.
Way more common. Tests are at best overrated. And doing them properly is a big PITA. The first thing is that the person writing the tests and the person writing the code should be different. And our languages are not really suited for the requirements of testing. They can and do save your ass in certain situations, but the false security they provide is probably more dangerous.
This sounds very "the perfect is the enemy of good". Tests don't need to be perfect, they don't need to be written by different people (!!!), they don't need to cover 100% of the code. As long as they're not flaky (tests which fail randomly really can be worse than nothing) it really helps in development and maintenance to have some tests. It's really nice when the (frequent) mistakes I make show up on my machine or on the CI server rather than in production, and my (very imperfect, not 100% "done properly") tests account for a lot of those catches.
Obviously pragmatism is always important and no advice applies to 100% of features/projects/people/companies. Sometimes the test is more trouble to write than it's worth and TDD never worked for me with the exception of specific types of work (good when writing parsers I find!).
From my experience though I often do make logical errors in my code but not in my tests and I do frequently catch errors because of this. I think that's a fairly normal experience with writing automated tests.
Would having someone else write the tests catch more logical errors? Very possibly, I haven't tried it but that sounds reasonable. It also does seem like that (and the other things it implies) would be a pretty extreme change in the speed of development. I can see it being worth it in some situations but honestly I don't see it as something practical for many types of projects.
What I don't understand is saying "well we can't do the really extremely hard version so let's not do the fairly easy version" which is how I took your original comment.
You get visibility into your usage, and you're seeing if you're exceeding the usage. They recommend using plans if your typical traffic is 'only' up to 50TB per month. Occasional spikes are fine from what I understand.
This is not true. Even the Free plan has DDoS protection. L3/L4 (TCP SYN floods, UDP reflection attacks and similar) filtering is built-in and always-on, by default. CloudFront terminates TLS, and only forwards valid HTTP(S) requests to cache / origin.
The "Always-on DDoS Protection" on L7 is protection against massive request spikes, built natively into CloudFront. Detection and mitigation of these attacks happens inline.
Almond milk is not dairy milk, but it is absolutely "milk", in the sense of a white liquid derived from plants - a definition that has existed in English for hundreds of years.
The name "almond milk" has been used since at least the 1500s.
The problem is digesting that quantity of food, not the energy content. Elite athletes typically eat some potatoes but most of what they eat is more nutrient dense.
Seriously guys, get out your scale and weigh 13 pounds of potatoes. Could you really consume that much volume in a day without feeling sick? Let's do a reality check here.
About 6 weeks into a cross country bike tour, I spent a rest day eating all day. I think I ate 4 massive burgers and a large supreme pizza. Probably somewhere around 6000 calories.
A week prior, I ran out of food in the mountains. I finally got to a store, bought a loaf of bread, a pack of Oscar Mayer bologna, a pack of cheese slices, two sodas, and a red bull. When I left the bench near the store, I had a few slices of bread left.
When you put out incredible amounts of energy, you can eat a fairly incredible amount of food. I don't understand where the food goes, it really doesn't feel like you should be able to eat that much volume but you can.
Those foods you ate are all more calorically dense than plain potatoes so my point still stands. It's not about the calories but the total volume of food that the human gut can process in a day. Have you seen 13 pounds of potatoes?
Potatoes are ~0.7kg/L, so that's ~8.4L of potato over the course of the day & 370 grams per waking hour, which is one big potato.
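The arithmetic behind those figures can be checked with a quick sketch. The 0.7 kg/L density is the figure quoted above, not something independently verified, and the 16 waking hours is an assumption (it is the value that reproduces the ~370 g/hour number).

```python
LB_TO_KG = 0.45359237  # exact definition of the avoirdupois pound


def potato_volume_l(pounds: float, density_kg_per_l: float = 0.7) -> float:
    """Volume in litres for a given weight of potatoes.

    The default density is the thread's figure, taken as given.
    """
    return pounds * LB_TO_KG / density_kg_per_l


def grams_per_waking_hour(pounds: float, waking_hours: float = 16) -> float:
    """Grams to eat per waking hour; 16 waking hours is an assumption."""
    return pounds * LB_TO_KG * 1000 / waking_hours


print(f"{potato_volume_l(13):.1f} L per day")
print(f"{grams_per_waking_hour(13):.0f} g per waking hour")
```

Under those assumptions, 13 lb works out to roughly 5.9 kg, which matches the ~8.4 L and ~370 g/hour quoted.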
Yes my examples were somewhat high energy density but in the second example, I probably had 3L+ in my belly in just one meal. Honestly I think your body just gets used to it.
I'm not familiar with the 13 lb of potato claim, but it strikes me as a stretch (hah!) but not inherently implausible.
I had a HS friend who was a serious swimmer (not quite Olympic level but he won state championships) and watching him eat was insane. He would eat about 3x of what we all ate. Like literally down 3 sandwiches while we had one. I think he was on a 6000 calorie a day diet. I believe the potato thing. It sure sits outside what I think I could eat, but having seen others do similar, it seems realistic.
I do weightlifting and have gone through times where I do more weightlifting and less. The body just absorbs food when you do more work. Can't tell you the mechanics but it's way easier for me to digest more food when I do more weightlifting.