Hacker News | ksclk's comments

Does cron work (wake up the VM) there?

I assume it's cheaper to own the whole vertical slice at this scale, so you can control everything. Given that there's the financial incentive to do it, how would you prevent companies from growing vertically? If you declared a legal limit, how would you prevent a single entity from forming a chain of companies, effectively producing one huge vertical company as well?


> I assume it's cheaper to own the whole vertical slice at this scale, so you can control everything.

In general it's the opposite: internal politics destroys value, and a single point of failure is a business risk even if you own it, because failure is rarely intentional.

As an example of the first, Kodak invented digital cameras but then failed to capitalize on them because it would have cannibalized their film business, and now their film business is dead anyway but so is the entire company. As an example of the second, Intel has vertically integrated fabs but now that their fabs are behind it's sinking the rest of the company. You could tell a similar story about AMD a decade and a half ago and spinning off their fabs is a big part of what saved them. IBM was also a big vertically integrated monster back in the day and they got out-competed by, well, everybody, and now they're a hollowed out consultancy.

The way out of this for a large conglomerate is to not take internal dependencies. So for example, Samsung makes both DRAM and devices, and they typically use their own DRAM in their own devices. But it's industry standard DRAM that they sell to anyone who is willing to pay them for it, and if Samsung's DRAM fabs all got destroyed by a natural disaster or their technology fell behind for some reason, their device units could immediately switch to a competitor until their DRAM unit got their house back in order. Likewise, if their consumer devices became uncompetitive their DRAM unit could still sell to the rest of the market because they're not fully beholden to a single internal customer. And having that serves as a canary; Intel didn't have external fab customers so it didn't notice them switching to TSMC, which would otherwise have been a red flag.

The "problem" is that you need to have some foresight. Everything's great until it isn't. If a company waits until one of the internal units has a problem before realizing that it's a single point of failure for other business units, it's too late to redesign the ship after you've already hit the iceberg.


By enforcing antitrust laws, like it has been done many times in history?


> the idea of creating black boxes that you can't misuse

Could you please expand upon your idea, particularly the claim that creating (from what I understood) a hierarchical structure of "blackboxes" (abstractions) is bad, and perhaps provide some examples? As far as I understand, composing lower-level bricks (e.g. classes or functions that encapsulate some lower-level logic and data, whether technical details or business stuff) into higher-level bricks is what I was taught to be a fundamental idea in software development for managing complexity.

> structure things as a loosely coupled set of smaller components

Mind elaborating upon this as well, pretty please?


> Could you please expand upon your idea that [..] a hierarchical structure of "blackboxes" [...] is bad?

You'll notice yourself when you try to actually apply this idea in practice. But a possible analogy is: How many tall buildings are around your place, what was their cost, how groundbreaking are they? Chances are, most buildings around you are quite low. Low buildings have a higher overhead in space cost, so especially in denser cities, there is a force to make buildings with more levels.

But after some number of levels, there are diminishing returns from going even higher, compared to just creating an additional building of the same size. And the overhead increases: higher levels are more costly to construct, and they require a better foundation. We can also see that most tall buildings are quite boring: how to construct them is well understood, and there isn't much novelty. There just aren't that many types of buildings that have all of these properties: 1) tall/many levels, 2) low overall cost of creation and maintenance, 3) practical, 4) novel.

With software components it's similar. There are a couple of ideas that work well enough that you can stack them on top of each other (say, CPU code on top of CPUs on top of silicon, userspace I/O on top of filesystems on top of hard drives, TCP sockets on top of network adapters...), which allows you to make things that are well enough understood and robust enough, and it's really economical to scale out on top of them.

But also, there isn't much novelty in these abstractions. Don't underestimate the cost of creating a new CPU, a new OS, or new software components, and of maintaining them!

When you create your own software abstractions, those just aren't going to be that useful, they are not going to be rock-solid and well tested. They aren't even going to be that stable -- soon a stakeholder might change requirements and you will have to change that component.

So, in software development, it's not like you come up with rock-solid abstractions and combine 5 of those to create something new that solves all your business needs and is understandable and maintainable. The opposite is the case. The general, pre-made things don't quite fit your specific problem. Their design wasn't focused on a specific goal. The more of them you combine, the less the solution fits, the less understandable it is, and the more junk it contains. Also, combining is not free. You have to add a _lot_ of glue to even make it barely work. The glue itself is a liability.

But OOP, as I take it, is exactly that idea. That you're creating lots of perfect objects with a clear and defined purpose, and a perfect implementation. And you combine them to implement the functional requirements, even though each individual component knows only a small part of them, and is ideally reusable in your next project!

And this idea doesn't work out in practice. When trying to do it that way, we only pretend to abstract, we just pretend to reuse, and in the process we add a lot of unnecessary junk (each object/class has a tendency to be individually perfected and to be extended, often for imaginary requirements). And we add lots of glue and adapters, so the objects can even work together. All this junk makes everything harder and more costly to create.

> structure things as a loosely coupled set of smaller components

Don't build on top of shoddy abstractions. Understand what you _have_ to depend on, and understand the limitations of that. Build as "flat" as possible i.e. don't depend on things you don't understand.


Thanks a ton! While I don't have the experience to understand all of it, I appreciate your writing, like the sibling poster (and that you didn't delete your comment)!

It reminds me of huge enterprise-y tools, which in the long run often are more trouble than they're worth (and reimplementing just the subset you need perhaps would be better), and (the way you speak about OOP) bloated "enterprise" codebases with huge classes and tons of patterns, where I agree making things leaner and less generic would do a lot of good.

At first, however, I thought you were against the idea of managing complexity by hierarchically splitting things into components (i.e. basically encapsulation), which is why I asked for clarification: this idea seems fundamental to me, and seeing someone argue against it got me interested. I now think you're not against this idea, but rather against having overly generic abstractions (components? I'm not sure if I'm using the word "abstractions" correctly here) in your stack, because they're harder to understand, which makes sense to me. I assume this is what "blackbox" means here.

Does it sound correct?


I'm not at all against decomposition and encapsulation. But I do think that the idea of _hierarchical_ decomposition can easily be overdone. The hierarchy idea might be what leads to building "on top" of leaky abstractions.


> When you create your own software abstractions, those just aren't going to be that useful, they are not going to be rock-solid and well tested. They aren't even going to be that stable -- soon a stakeholder might change requirements and you will have to change that component.

I also think it's about how many people you can get to buy in on an abstraction. There are probably better ways of doing things than the unix-y way of having an OS, but so much stuff is built with the assumption of a unix-y interface that we just stick with it.

Like why can't I just write a string of text at offset 0x4100000 on my SSD? You could but a file abstraction is a more manageable way of doing it. But there are other manageable ways of doing it right? Why can't I just access my SSD contents like it's one big database? That would work too right? Yeah but we already have the file abstraction.

>But OOP, as I take it, is exactly that idea. That you're creating lots of perfect objects with a clear and defined purpose, and a perfect implementation. And you combine them to implement the functional requirements, even though each individual component knows only a small part of them, and is ideally reusable in your next project!

I think OOP makes sense when you constrain it to a single software component with well-defined inputs and outputs. Like I'm sure many GoF-type patterns were used in implementing many STL components in C++. But you don't need to care about what patterns were used to implement anything in <algorithm> or <vector>; you just use these as components to build a larger component. When you don't have well-defined components that just plug and play over the same software bus, no matter how good you are at design patterns it's eventually gonna turn into an incomprehensible spaghetti mess.

I'm really liking your writing style by the way, do you have a blog or something?


I think I agree with your "buy-in" idea, but I'd add that the Unix filesystem abstraction is almost as minimal as it gets; at least I'm not aware of a simpler approach in existence. Maybe subtract a couple of small details that have turned out to be not optimal or useful. You can also, in fact, write a string to an offset on an SSD (open e.g. /dev/sda); you only need the necessary privileges (as you do for a file in a filesystem hierarchy too, btw).
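
For illustration, a minimal sketch of that point in Python (the device name and offset are just placeholders from the comment above; it needs root, and it will happily clobber data if you actually run it against a disk you care about):

    import os

    fd = os.open("/dev/sda", os.O_WRONLY)   # raw block device, no filesystem involved
    os.lseek(fd, 0x4100000, os.SEEK_SET)    # seek to a raw byte offset on the disk
    os.write(fd, b"hello, raw disk\n")      # overwrites whatever happened to live there
    os.close(fd)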

A database would not work as mostly unstructured storage for uncoordinated processes. Databases are quite opinionated and require global maintenance and control, while filesystems are less obtrusive, they implement the idea of resource multiplexing using a hierarchy of names/paths. The hierarchy lets unrelated processes mostly coexist peacefully, while also allowing cooperation very easily. It's not perfect, it has some semantically awkward corner cases, but if all you need is multiplexing a set of byte-ranges onto a physical disk, then filesystems are a quite minimal and successful abstraction.

Regarding STL containers, I think they're useful and usable after a little bit of practice. They allow you to get something up and running quickly. But they're not without drawbacks, and at some point it can definitely be worthwhile to implement custom versions that are more straightforward, more performant (avoiding allocation, for example), have better debug performance, have less line noise in their error messages, and so on. For the most important containers in the STL, it's quite easy to implement custom versions with fewer bells and whistles. Maybe with the exception of map/the red-black tree, which is not that easy to implement and is sometimes the right thing to use.


> I'm really liking your writing style by the way, do you have a blog or something?

Thank you! I don't get to hear that often. I have to say I was almost going to delete that above comment because it's too long, the structure and build up is less than clear, there are a lot of "just" words in it and I couldn't edit anymore. I do invest a lot of time trying to write comments that make sense, but have never seen myself as a clear thinker or a good writer. To answer your question, earlier attempts to start a blog didn't go anywhere really... Your comment is encouraging though, so thanks again!


Hi! Could you please tell me what use cases nginx would be better for, outside of serving static files?

Moreover, having worked with Django a bit (I certainly don't have as much experience as you do), it seems to me that anything that benefits from asynchrony and is trivial in Node is indeed a pain in Django. Good observability is much harder to achieve (tools generally support Node and its asynchrony out of the box, async Python not so much). Celery is decent for long-running, background, or fire-and-forget tasks, but using it for some quick parallel work that would be a simple Promise.all() in Node is much less performant (serialize your args, put them in Redis, wait for a worker to pick them up, etc). And doing anything that blocks a thread for a little bit, whether in Django or Celery, is a problem, because you've got a very finite number of threads (unless you use gevent, which patches the stdlib, which is a huge smell in itself), and it's easy to run out of them... Sure, you can work around anything, but with Node you don't have to think about any of this; it just works.
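
To make that concrete, here's roughly what the Promise.all-style fan-out looks like through Celery (a rough sketch; the task and module names are made up):

    from celery import group
    from myapp.tasks import fetch_price    # a hypothetical @shared_task

    job = group(fetch_price.s(item_id) for item_id in [1, 2, 3])
    result = job.apply_async()              # args are serialized and sent to the broker (e.g. Redis)
    prices = result.get(timeout=30)         # block until workers have picked up and finished all of them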

When you're still small, isn't taking a week to move to Node a better choice than first evaluating a solution to each problem, implementing solutions, each of which can be more or less smelly (which is something each of your engs will have to learn and maintain... We use celery for this, nginx for that, also gevent here because yada yada, etc etc), which in total might take more days and put a much bigger strain on you in the long term? Whereas with Node, you spend a week, and it all just works in a standard way that everyone understands. It seems to me that exploring other options first would indeed be a better choice, but for a bigger project, not when the rewrite is that small.

Thank you for your answers!


> you can already use a SAT solver

Could you elaborate please? How would you approach this problem, using a SAT solver? All I know is that a SAT solver tells you whether a certain formula of ANDs and ORs is true. I don't know how it could be useful in this case.


Pretty much all instructions at the assembly level are sequences of AND/OR/XOR operations.

SAT solvers can prove that some (shorter) sequences are equivalent to other (longer) sequences. But it takes a brute force search.

IIRC, these superoptimizing SAT solvers can see patterns and pick 'Multiply' instructions as part of their search. So it's more than traditional SAT. But it's still... at the end of the day... a SAT equivalence problem.


A short look at any compiled code on godbolt will very quickly inform you that pretty much all instructions at the assembly level are, in fact, NOT sequences of AND/OR/XOR operations.


All instructions are implemented with logic gates. In fact, all instructions today are likely implemented with NAND gates only.

Have you ever seen a Wallace tree multiplier? It's a good example of how XOR and AND gates can implement multiply.

Now, if multiply + XOR gets the new function you want, it's likely better than whatever the original compiler output.


That was not the claim. The claim was that assembler was made up out of sequences of OR/AND/XOR, and that claim is demonstrably false.


"Made up of" isn't helpful. All assembly languages are equivalent to a sequence of bit operations. (…if you ignore side effects and memory operations.)

In fact there's several different single operations you can build them all out of: https://en.wikipedia.org/wiki/One-instruction_set_computer#I...

So you take your assembly instructions, write a sufficiently good model of assembly instructions<>bit operations, write a cost model (byte size of the assembly works as a cheap one), and then search for assembly instructions that perform the equivalent operation and minimize the cost model.

Like here: https://theory.stanford.edu/~aiken/publications/papers/asplo...
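
As a toy sketch of that loop (a made-up four-instruction "ISA" in Python, with exhaustive testing over 8-bit inputs standing in for the SAT/SMT equivalence check):

    from itertools import product

    MASK = 0xFF  # pretend 8-bit machine

    # each "instruction" maps (accumulator, input x) to a new accumulator
    ISA = {
        "add_x": lambda acc, x: (acc + x) & MASK,
        "xor_x": lambda acc, x: acc ^ x,
        "shl_1": lambda acc, x: (acc << 1) & MASK,
        "sub_x": lambda acc, x: (acc - x) & MASK,
    }

    def run(seq, x):
        acc = x
        for op in seq:
            acc = ISA[op](acc, x)
        return acc

    def equivalent(seq, ref):
        # stand-in for the SAT/SMT check: test every 8-bit input
        return all(run(seq, x) == ref(x) for x in range(256))

    def superoptimize(ref, max_len=3):
        # cost model: number of instructions (encoded byte size works too)
        for length in range(1, max_len + 1):
            for seq in product(ISA, repeat=length):
                if equivalent(seq, ref):
                    return seq
        return None

    print(superoptimize(lambda x: (x * 3) & MASK))  # finds e.g. ('add_x', 'add_x')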


You're missing the forest for the trees, and lacking the background to really discuss this. Check out this paper and similar ones if you want to learn about this area: https://arxiv.org/pdf/1711.04422


What do you think implements assembly language?

Answer: Verilog or VHDL. And these all synthesize down to AND/OR/XOR gates and are eventually converted into NAND gates only.

Every assembly language statement is either data movement, or logic, or some combination of the two.

-------

We are talking about SAT solvers and superoptimizers. Are you at all familiar with this domain? Or have you even done a basic search on what the subject matter is?


Computers are not programmed on the level of logic gates. If you want to do that, design an FPGA.


Superoptimizers take this stuff into consideration.

Which is what the parent post was talking about, the rare superoptimizer.


If you want something optimised… "equivalent" isn't going to do it.


You're missing the point. All instructions can be simplified to short integer operations, then all integer operations are just networks of gates, then all gates can be replaced with AND/OR/NOT, or even just NAND. That's why you can SAT solve program equivalence. See SMT2 programs using BV theory for example.

Also of course all instructions are MOV anyway. https://github.com/xoreaxeaxeax/movfuscator
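
For example, with the z3-solver Python bindings (assuming they're installed), the equivalence check over bit-vectors is a one-liner:

    from z3 import BitVec, prove

    x = BitVec("x", 8)
    prove(x * 2 == x << 1)   # different instruction sequences, same function: prints "proved"
    prove(x * 2 == x << 2)   # not equivalent: prints a counterexample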


I seem to remember that program equivalence is an NP-hard problem. I very much doubt that you can solve it by reducing it to logic gates.


NP-hard is just a complexity class, not a minimum difficulty for every instance. SAT solving is a kind of guided, optimistic brute-forcing. If the instance is small enough, you can still solve it very quickly, the same way you can solve travelling salesman for a small number of cities. We solve small cases of NP-hard problems all over the place.
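
For instance, brute-forcing travelling salesman over a handful of made-up cities in Python runs instantly despite the NP-hardness:

    from itertools import permutations
    import math

    cities = [(0, 0), (2, 1), (1, 4), (5, 3), (4, 0)]  # arbitrary coordinates

    def tour_length(order):
        return sum(math.dist(cities[a], cities[b])
                   for a, b in zip(order, order[1:] + order[:1]))

    # fix city 0 as the start: only 4! = 24 candidate tours to check
    best = min(([0] + list(p) for p in permutations(range(1, len(cities)))),
               key=tour_length)
    print(best, tour_length(best))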


(Not to confirm it's an NP-hard problem - it may not even be decidable in general - but in practice, yes, you can check it that way, and SMT2 theories provide the right ops.)


That's why superoptimizers work on short sequences of code and take a long time doing so.


I see a guy who has never seen assembly explain assembly to people who have written assembly and written compilers and optimisations…


What you're saying is true. Yes you can grind away at generating sequences of instructions, SAT solve equivalence and benchmark, but of course, you would sooner see all black holes in the observable universe evaporate before you find an instruction sequence that is both correct AND 100x faster.


You're a bit naive about the complexity. Commonly, longer sequences are actually faster, not just because instructions vary in their speed, but also because the presence of earlier instructions that don't feed results into later instructions still affects their performance. Different instructions consume different CPU resources and can contend for them (e.g. the CPU can stall even though all the inputs needed for a calculation are ready, just because you've done too many of that operation recently). And keep in mind that when I say "earlier instructions" I don't mean earlier in the textual list, I mean earlier in the history of instructions actually executed; you can reach the same instruction by many different paths!


Hmm, this usually doesn't come up simply because you're usually targeting multiple different CPU generations at once, and then the details cancel each other out.

The most ffmpeg has had to do in this area is that some CPUs had very slow unaligned memory loads and some didn't.

