
I think you underestimate how many people work at large companies that have to deal with these problems. Not everyone is in the HN startup bubble working for a company with barely any customers.


I think you highly overestimate it. Most large companies are a series of small groups that act as companies that have nearly trivial (to the point of absurdity) engineering concerns.

Sure, a few groups in each F500 need epic skills - but I think that's a vanishingly small share of the (actual) work being done. The term Enterprise, and what it stands for, earned its laughable reputation for a reason.


I've worked at smaller companies. Even still, it's not hard to hit relatively large amounts of data, depending on the field.

As an example, any kind of analytics could generate terabytes of data a day... per customer.

A side project I am building will have to handle billions of events per day. Per customer. There are 0 customers (this is for fun, not profit), but as soon as it would hit one customer I would need to consider an approach that scales.
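For a sense of scale, a quick back-of-the-envelope on what "billions of events per day" implies (assuming an even arrival rate, which real traffic won't have; the numbers here are illustrative, not from the comment):

```python
# One billion events/day is the low end of "billions".
events_per_day = 1_000_000_000
seconds_per_day = 24 * 60 * 60  # 86,400

# Sustained throughput a single customer at this volume demands.
rate = events_per_day / seconds_per_day
print(f"{rate:,.0f} events/sec sustained")  # prints ~11,574 events/sec
```

Peak load is typically several times the sustained average, which is why a single-node design stops being an option after the first such customer.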

How many companies have similar requirements?

But that's actually beside the point.

Microservice architecture, or any architecture that focuses on isolated, asynchronous components, adds complexity. Of course.

But it also reduces work in other areas. If you build async, isolated services, you no longer have to deal with catastrophic service failure: cascading failures stop at the async boundary.

For many of us, I imagine we've spent a lot of time fighting fires at organizations where one service going down was a serious problem, causing other services to fail, and setting your infrastructure ablaze. Hence a bias towards solving that problem upfront.
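The async-boundary argument above can be sketched as follows (a minimal, hypothetical example: a toy in-process queue stands in for a real message broker, and all names are made up):

```python
import queue

# The async boundary: producers hand events to a bounded queue instead
# of calling the downstream service directly. If the consumer is down,
# the producer sheds load at the boundary rather than failing with it.
events = queue.Queue(maxsize=1000)

def produce(n):
    """Accept up to n events; never block on a slow or dead consumer."""
    accepted = 0
    for i in range(n):
        try:
            events.put_nowait({"id": i})
            accepted += 1
        except queue.Full:
            pass  # drop (or spill to disk) instead of propagating failure
    return accepted

def drain():
    """Consumer side: process whatever has accumulated, once healthy."""
    out = []
    while not events.empty():
        out.append(events.get_nowait())
    return out
```

The design choice being illustrated: the producer's failure mode is degraded delivery (dropped or delayed events), not an outage of its own, so one dying service no longer takes its callers down with it.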


> As an example, any kind of analytics could generate terabytes of data a day... per customer.

Wait, what? I've never worked anywhere where one customer generated terabytes of data per day, and I've worked on very large commercial enterprise software.

The only thing I have experience with that produces anything close to that kind of data per customer is in genetic sequencing, and you only do a customer once. (Even that isn't a TB in its bulkiest, raw data form, and the formats used for cross-customer analysis are orders of magnitude smaller).

> For many of us, I imagine we've spent a lot of time fighting fires at organizations where one service going down was a serious problem, causing other services to fail, and setting your infrastructure ablaze.

The reason so many of us have worked in places like that is that those places 'got stuff done' and survived and grew.


I was also confused by OP. I’m thinking maybe by customer he means a business using his analytics service. That would explain why it needs to scale for his first customer, and why one customer could have terabytes of data.


My comment was definitely biased towards the big tech companies where I have worked, so not all of the Fortune 500. That being said, I have worked on a 2-person and a 5-person team, each managing hundreds of terabytes of data, and both companies have tens of thousands of engineers.

Both companies are just part of the large tech scene, hence my skepticism about the claim that few engineers have to manage tons and tons of data or distributed systems: there are probably hundreds of thousands, if not millions, of engineers who have to think about these problems outside the two companies I have worked for.


Building a system out of a handful of services, databases, caches, and queues is table stakes for a backend engineering intern, not "epic skills."


And knowing how to architect it so that it actually scales well is beyond most senior engineers. I think you underestimate the difficulty.



