
One of the most important reasons to reinvent the wheel, which is not mentioned by the author, is to avoid adding complexity through unnecessary dependencies.


100% this, and I'll add that libraries become popular because they solve an issue in many different scenarios.

That means that, almost by definition, if a library is popular, it contains huge amounts of code that just isn't relevant to your use case.

The tradeoff should be whether you can code your version quickly (assuming it's not a crypto library; never roll your own crypto), because if you can, you'll be more familiar with it and carry a smaller dependency.


“Never roll your own crypto” is just this year’s “never roll your own date library”. There will always be something. Could I code this? Even if it’s quick, there’s ongoing maintenance cost and you lose out on the FOSS community identifying and fixing vulnerabilities as well as new features that you may need to use. Yes, the library might be large and contain things you don’t need, but that’s the tradeoff. You can mitigate this (depending on the platform and language)—for example, with ESM tree-shaking.

I’d rather install date-fns or moment and let it decide what the fourth Sunday of a given month is in 2046, and also audit for the latest browser attack vectors.
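
For example, here's a sketch assuming date-fns (these are documented date-fns functions, and the named ESM imports are what lets a bundler tree-shake away the rest of the library):

```
import { startOfMonth, isSunday, nextSunday, addWeeks } from 'date-fns';

// Fourth Sunday of a given month (e.g., March 2046), leaning on the
// library's tested calendar logic instead of hand-rolled date math.
function fourthSunday(year: number, monthIndex: number): Date {
  const first = startOfMonth(new Date(year, monthIndex, 1));
  const firstSunday = isSunday(first) ? first : nextSunday(first);
  return addWeeks(firstSunday, 3);
}

console.log(fourthSunday(2046, 2)); // fourth Sunday of March 2046
```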


I also agree to avoid adding complexity through unnecessary dependencies.

> if a library is popular, it contains huge amounts of code that just isn't relevant to your use case.

It is true that many libraries do contain such code, whether or not they have dependencies. For example, SQLite does not have any dependencies but does have code that is not necessarily relevant to your use. However, some programs (including SQLite) support conditional compilation; that sometimes helps, but in many cases it is not suitable, since it is still the same program, and conditional compilation does not change it into an entirely different one better suited to your use.

Also, I find often that programs include some features that I do not want and exclude many others, and existing programs may be difficult to change to do it. So that might be another reason to write my own, too.


Unfortunately, if you depend on any libraries, there's a decent chance one of them depends on some support library. Possibly for just one function. And then your build tool downloads the entire Internet.


A Ruby application that I will soon need to use downloads 227 packages.


That's wild. Which gem is it? Something AWS?


OP said that was for an entire app, not dependencies for a single gem. And it’s not really that many. A bone-stock Rails app includes almost 120 gems out of the box. Add a few additional gems that each have their own dependencies and you can get up to over 200 total packages pretty quick.


> A bone-stock Rails app includes almost 120 gems out of the box.

Maybe it shouldn't be necessary to have 120 different external libraries just to run a bone-stock app? They should be in the standard library.


It is a command-line tool for document production.


That depends a lot on your language / build system. The easier it is to add a dependency, the more likely that is to be how they work, broadly speaking.


> almost by definition, if a library is popular, it contains huge amounts of code

popularity != bloat


It almost always means bloat though, because any library that’s not updated in the span of a year is considered “abandoned” and succumbs to feature creep.


"Never roll your own crypto" usually means "never devise your own crypto algorithms". Implementing an established algorithm yourself is OK provided you can prove your implementation works correctly. And... well, as Heartbleed showed, that's hard even with established crypto libraries.


Note that there are quite a few ways that crypto implementations can be insecure even if it's proven to be "correct" (in terms of inputs and outputs). For instance, it may leak information through timing, or by failing to clear sensitive memory due to a compiler optimization.
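
For instance, a comparison that is functionally correct can still leak through timing. A minimal TypeScript sketch for Node (the naive version is purely illustrative; timingSafeEqual is Node's built-in constant-time comparison):

```
import { timingSafeEqual } from 'node:crypto';

// Correct in terms of inputs and outputs, but it short-circuits at the
// first mismatched byte, so response time leaks how many leading bytes
// an attacker has guessed correctly.
function naiveEqual(a: Buffer, b: Buffer): boolean {
  if (a.length !== b.length) return false;
  for (let i = 0; i < a.length; i++) {
    if (a[i] !== b[i]) return false; // early exit = timing side channel
  }
  return true;
}

// timingSafeEqual examines every byte regardless of where a mismatch
// occurs; it throws on unequal lengths, so check that first.
function constantTimeEqual(a: Buffer, b: Buffer): boolean {
  return a.length === b.length && timingSafeEqual(a, b);
}
```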


Getting the algorithm right is the easy part. It's the details of the implementation that kill you. Don't roll your own crypto.


That’s true for frameworks but not good libraries.


Frameworks are clearly worse, that's true. But there are also kitchen-sink libraries that are too shallow in relation to their API surface, or libraries that perform certain background work or modify some external global state in a way that is unnecessary for your use case, or libraries that pull in transitive dependencies of that sort. You really want to minimize the code that executes in order to fulfill your use case, and also minimize the temptation to depend on additional, tangential functions of a library that you wouldn’t have added for those functions alone.


Actually, now that we have decent LLMs, rolling your own crypto is probably feasible.


Any fool can write an encryption algorithm that he himself can't break. The NSA would greatly prefer that you did, too. Security is an arms race - you have to counter the latest generation of attackers.

It's okay to write a compiler or a database if you only know the basic principles, but it's not okay to write an encryption algorithm, or even an implementation of one, using only basic principles, because someone who knows more than you will break it.

For instance, were you aware that every time you write array[index], it leaks data to other threads in other processes on the same CPU, including JavaScript code running inside web browsers?


Yes of course, but do you know exactly who wrote your encryption libraries and what their qualifications are and who they work for or what their conflicts of interest might be?

I really doubt people give it even a second thought.


That holds not only for cryptography libraries, but generalizes to the entire computing stack. It's why, for example, coreboot exists, as well as various open-source hardware projects. If it's fully open, you can inspect it yourself. Anywhere I see a branching statement in a cryptography context, I'll know something's up.

The problems introduced in xz are still fresh, but Dual_EC_DRBG[0] also comes to mind within the cryptography context.

(Besides, getting cryptography right goes way beyond "just writing a library". As the parent commenter wrote, simple operations are the tip of the iceberg with regards to a correct implementation)

[0] https://en.wikipedia.org/wiki/Dual_EC_DRBG
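
On the branching point, here is a sketch (illustrative TypeScript, not from any particular library) of why a secret-dependent branch is a red flag, and the branch-free idiom you'd expect to see instead:

```
// Branching on a secret makes execution time and branch-predictor state
// depend on that secret, which an attacker can measure.
function selectWithBranch(secretBit: number, a: number, b: number): number {
  return secretBit ? a : b; // leaks via timing and branch prediction
}

// Branch-free alternative: derive a full-width mask from the bit and
// blend, so the same instructions run for either value (32-bit
// JavaScript bitwise semantics assumed).
function selectConstantTime(secretBit: number, a: number, b: number): number {
  const mask = -(secretBit & 1); // 0 -> 0x00000000, 1 -> 0xFFFFFFFF
  return (a & mask) | (b & ~mask);
}
```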


If you’re not aggressively vetting the crypto libraries you’re using, you’re more or less exposing yourself to the same probability of risk as rolling your own crypto.


I'll pick my battles; it's part risk appetite and part expected attacker model. It also depends on what I'm trying to protect.

If it's in a standard library of an open source programming language, I'm less inclined to fully check the implementation.


If you're picking some random blob of code from Github, then yes. If you're picking OpenSSL or Bouncy Castle, then no. Despite Heartbleed.


When you use OpenSSL, you can trust that if you're screwed, so is the whole internet.

That actually happened once.


An underrated middle ground, at least when it comes to open source, is vendoring the dependency, cutting out the stuff you don't need, and adapting the API so that it's not introducing more complexity than it has to.

This is also generally helpful when you have performance requirements, as third-party code, even when well optimized in general, often isn't optimized for any particular use case.
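
As a sketch of the shape this takes (everything here is hypothetical, standing in for a trimmed, vendored copy of some third-party library):

```
// Imagine this is the one function kept from a vendored library after
// deleting the majority of it that this project never calls.
function vendoredRender(source: string): string {
  return `<p>${source}</p>`; // stand-in for the real implementation
}

// The adapter is the only surface the rest of the codebase sees, so the
// vendored code can be re-trimmed or swapped without touching callers.
export function toHtml(source: string): string {
  return vendoredRender(source);
}
```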


I really like doing this.

I find that in many cases you can cut out 80 percent of the code of the original library.

Most of the deleted code is flexibility and features we don't need.

It's surprising how small the essence of a solution can be.


Another: getting good at invention / research is a skill that can be honed through practice - and you can practice on previously solved problems.


That's the main reason that I tend to "Reinvent the wheel."

Also, the dependencies often have a lot of extra "baggage," and I may only want a tiny bit of the functionality. Why should I use an 18-wheeler when all I want to do is drive to the corner store?

Also, and this is really all on me (it tends to be a bit of a minority stance, but it's mine), I tend to distrust opaque code.

If I do use a dependency, it's usually something that I could write myself, if I wanted to devote the time, and something that I can audit before integrating it.

I won't use opaque executables unless I pay for them. If something costs no money, I expect to be able to see the source.


I was on a code review where someone imported an entire library just to use the library's Pair<> class, which Java does not have.

But it is literally a one-liner to declare a Pair<> type in Java:

```
record Pair<S, T>(S first, T second) {}
```


Custom solutions, while initially potentially less complex, gradually grow in complexity. There might come a time when it's worth it to throw out your custom solution and replace it with a more general dependency. There's still a benefit, because a dependency introduced at this stage is used far more thoughtfully: you know the problem it solves inside out.

It might also change your psychological relationship with the dependency. Instead of being disgusted by yet another external dependency bringing poorly understood complexity into your project, you are thankful that there exists a piece of code, maintained and tested by someone else, that does the thing you know you need done and lets you remove a whole mess of complexity you yourself constructed.


Yeah, I built a library to run tasks based on a directed acyclic graph (DAG), where each task can optionally belong to a queue.

So I had to write a simple queue. Since I wanted demos to work in the browser, it has an IndexedDB backend; I wanted demos to work in an Electron app, so there is a SQLite backend; and I'll likely want a multi-user, server-based one, so there is a Postgres backend.

And I wanted to use it for rate limiting, etc, so limiters were needed.

And then there is the graph stuff, and the task stuff.

There are a lot of wheels to create, actually, if you don't want any dependencies.

I do have a branch that uses TypeBox to make and validate the input and output JSON schemas for the tasks, so the core may not stay dependency-free forever.
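
For a sense of the shape, here's a hypothetical sketch of the pluggable queue-backend idea (the names are illustrative, not my library's actual API):

```
// One storage contract, many backends: IndexedDB in the browser,
// SQLite in Electron, Postgres on a server.
interface QueueStorage<T> {
  enqueue(taskId: string, payload: T): Promise<void>;
  dequeue(): Promise<{ taskId: string; payload: T } | undefined>;
}

// An in-memory version is enough to show the contract.
class InMemoryQueue<T> implements QueueStorage<T> {
  private items: { taskId: string; payload: T }[] = [];
  async enqueue(taskId: string, payload: T): Promise<void> {
    this.items.push({ taskId, payload });
  }
  async dequeue(): Promise<{ taskId: string; payload: T } | undefined> {
    return this.items.shift();
  }
}
```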


Don't re-invent the wheel, use Airflow :)


Not the grandparent, but Airflow is painfully slow and inefficient.

Our reinvented wheel using PostgreSQL, RabbitMQ, and EC2 runners has ~10x better throughput and scales linearly with the number of pending tasks, whereas Airflow falls apart and fails to keep the runners fully occupied the moment you put any real load on it.


Does Airflow run in a browser?


If you mean it has an html based UI, yes.


+1, also sometimes to avoid complexity due to unnecessary abstractions/modularity, etc.


I'll agree with this, though in a lot of cases reinventing the wheel is a bad idea. A previous coworker insisted on writing everything instead of using libraries, so I had to maintain a crappy, undocumented, buggy version of what was available in a library.


Less "reinventing" the wheel and more "uncovering the wheel."


Reinventing the wheel is the real deal.



