In the short term sure it was a good payout. I guess the main flaw in the criticisms is overestimating how quickly big tech would catch up. It did happen but there was enough time to sell for a large sum first.
I wonder how long it'll take (if it hasn't already) until the messaging around this inevitably moves on to "Do not self-host this, are you crazy? This requires console commands, don't be silly! Our team of industry-veteran security professionals works on your digital safety 24/7, you would never be able to keep up with the demands of today's cybersecurity attack spectrum. Any sane person would host their claw with us!"
Next flood of (likely heavily YC-backed) Clawbase (Coinbase but for Claws) hosting startups incoming?
What exactly are they self hosting here? Probably not the model, right? So just the harness?
That does sound like the worst of both worlds: you get the dependency and data-protection issues of a cloud solution, but you also have to maintain a home server to keep the agent running?
You have to spend tens of thousands of dollars on hardware to approach the reasoning and tool-call levels of SOTA models... so casually saying "just use a local LLM" ignores that it's out of reach for the common man.
That's pretty much how it was in the 90s with computer tech. 10 years later we were watching cat videos on machines that dwarfed the computing power of what used to be servers.
That ship sailed a long time ago. It's of course possible, if you're willing to invest a few thousand dollars extra for the graphics card rig and pay for the power.
> but you also have to maintain a home server to keep the agent running on
It surprises me that a lot of people here don't have multiple Mac minis or Minisforum or Beelink systems running at home. That's been a constant I've seen in tech since the 90s.
I already built an operator so we can deploy nanoclaw agents in kubernetes with basically a single yaml file. We're already running two of them in production (PR reviews and ticket triaging).
1. Another AI agent (in reality: a bunch of folks in a third-world country) to gatekeep and spot-check selected inputs/outputs for data leaks.
2. Using advanced network isolation techniques (read: a bunch of iptables rules and security groups) to limit possible data exfiltration.
This would actually be nice, as the agent for WhatsApp would run in a separate entity with network access limited to only WhatsApp's IP ranges...
3. An advanced orchestration engine (read: crontab & a bunch of shell scripts) provided as first-party components to automate day-to-day stuff.
Possibly an IFTTT/Zapier-style integration, where you drag and drop objectives/tasks in a *declarative* format and the agent(s) figure out the rest...
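As a toy sketch of the network-isolation idea in point 2: an egress allowlist boils down to a set of permitted CIDR ranges. In practice you'd enforce this with iptables/nftables or security groups rather than in application code, and the range below is purely illustrative, not the provider's actual allocation:

```python
import ipaddress

# Hypothetical allowlist standing in for the provider's published IP ranges
ALLOWED_EGRESS = [ipaddress.ip_network("157.240.0.0/16")]

def egress_allowed(dest_ip: str) -> bool:
    """True if the destination falls inside a permitted range."""
    addr = ipaddress.ip_address(dest_ip)
    return any(addr in net for net in ALLOWED_EGRESS)

print(egress_allowed("157.240.12.34"))  # True: inside the allowlisted range
print(egress_allowed("8.8.8.8"))        # False: everything else is dropped
```

The same membership check is what a firewall rule compiles down to; keeping the list small and per-agent is what makes the "separate entity per integration" idea workable.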
I run Qwen3-Coder-Next (Qwen3-Coder-Next-UD-Q4_K_XL) on a custom build around the Framework ITX board (Max+ 395, 128GB). Avg. eval at 200-300 t/s and output at 35-40 t/s, running llama.cpp with ROCm. I still prefer Claude Code for the CLI.
I can only recommend giving headscale a try. It's free, works extremely well, and can be used with the official Tailscale clients. Was super easy to set up.
Could you give a brief description of your use case? I'm looking at all the tailscale buzzwords on their site, but am not really understanding what I would use this for in my home setup
Not sure about the parent, but here's what I use it for:
A) easily access my other, older machines from my phone or work laptop to:
- self-host a Coolify server (a "vercel-lite" control panel)
- remote connect to my older laptop to run tests/longer coding tasks for work (e.g. large browser test suites, sandboxed claude running in bg to answer longer code questions, or build fire and forget spikes/experiments)
- control my home cinema remotely (the Remote+ app because it's easy, plus Remote Desktop).
- use with Mullvad VPN as an exit node (Tailscale has a really easy UI for it nowadays)
B) use it like ngrok to expose my dev servers to the internet (e.g. when sharing a quick demo/pairing with a coworker)
C) cheap NAS - the old Mac is connected to an external HD (the HD itself is archived to Hetzner)
I haven't (yet) tested it as an alternative to Hamachi (is that still a thing?) but I'm planning a LAN party with my brothers, who live across the continent.
Like you, I also didn't know what the fuss was about, and I'm generally cautious not to get sidetracked.
I run it on all my VPSes, and it lets me close every port but 80 and 443; even port 22 is closed.
I SSH through the tailnet without worrying about remembering IPs, thanks to how their MagicDNS works.
I have deployed some admin dashboards, and it simplifies security a lot because I don't have to worry about them being exposed to the internet; I can connect to them directly via http://my-vps:port from any device on the tailnet.
I sometimes also use my vps as an exit node whenever I need a vpn
I know this might sound like a commercial, but it's not; it's one of those pieces of tech that has really changed how I work since I discovered it, and I can't do anything but recommend it.
That said, their free tier is more than enough for me, and if they didn't have one I probably wouldn't pay for this and would just find an open source alternative.
I haven't checked headscale in depth but seems promising, will give it a try
I have some servers sending their telegraf data to a server in my home using the tailnet instead of opening a port on my firewall for that, to name one use case.
It has pretty good ACL functionality; you can configure which hosts with a certain tag can access certain routes.
yeah, same here with the number of nodes I had on the public internet, when all I really needed was some internal connectivity (exactly like you have here: a machine sending logs to an internal-only Loki instance, and a Grafana node that is also only internally relevant and never needs to see the public internet).
I have one VPS node that I use as a connector, where the headscale app is installed. I have this on a domain (for convenience), so think something like:
hs.mygreatplace.com
Now, when I install the Tailscale client on any device (phones, tablets, Linux machines, Proxmox nodes, etc.), I simply tell it: don't use the Tailscale network, route this over my own network. That means pointing it at hs.mygreatplace.com as the coordination server, which speaks the same protocol Tailscale's does, and that's it. Pointing clients at your own server is officially supported by Tailscale, which is great and makes it all work.
Then, when pairing for the first time, you get a link/code; click it and/or enter it on the hub (hs.mygreatplace.com) and it's paired.
That connection is up and will stay up. So even while that new device is behind a firewall, I can always connect to it. You open Tailscale and see all your paired devices. Each now gets an additional internal IP (from the 100.64.0.0/10 range, e.g. 100.64.0.1) and you use that to SSH or otherwise connect to it.
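For reference, those internal addresses come from the 100.64.0.0/10 carrier-grade NAT block that Tailscale reserves for tailnets; a few lines of Python can sanity-check whether an address is tailnet-internal:

```python
import ipaddress

# Tailscale assigns node IPs from the carrier-grade NAT range 100.64.0.0/10
TAILNET_RANGE = ipaddress.ip_network("100.64.0.0/10")

def is_tailnet_ip(addr: str) -> bool:
    """Return True if addr falls inside Tailscale's CGNAT allocation."""
    return ipaddress.ip_address(addr) in TAILNET_RANGE

print(is_tailnet_ip("100.101.102.103"))  # True: a typical tailnet address
print(is_tailnet_ip("192.168.1.10"))     # False: ordinary private LAN address
```

Handy when writing firewall rules or scripts that should only trust traffic arriving over the tailnet.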
I have a beefy Proxmox machine, and used to route many of these services out to the public internet through port mapping, but now I just leave them cut off entirely and only surface them inside of my private network. When connecting to these nodes (from iPhone, Laptops, etc.), there's zero configuration once it is set up, it auto-routes correctly and just acts like those nodes are on the internet, it's a dream.
It also automatically adds the node as a subdomain, so if you pair a proxmox node that runs grafana, and maybe has a hostname "grafana", it will show up and be always reachable as: grafana.hs.mygreatplace.com
It doesn't get much easier than that.
All that said, I HIGHLY recommend Tailscale for anyone who hasn't done much with private networking, just to try out first and get used to it. Their free tier is very generous and I think they've got a fantastic next-to-zero-config product, truly wonderful. However, my concern was being trapped with a ~$160M VC-funded (US-based) company when the inevitable rug gets pulled (as it always does, as anyone who's been on the internet for a minute should come to accept).
So I was looking for alternatives, and headscale immediately worked out. Of course, Tailscale ever killing their client's ability to use your own infra will lead to a similar end result (dead end), but I am sure those things can eventually be sorted out by open source attempts and clients (which headscale has, I just haven't tried them out yet, https://headscale.net/0.25.0/about/clients/).
I had a Wireguard network before (which this essentially also is, but in a much nicer packaging), but always ran into config problems with the shared profiles and IPs and so forth, so this was just a simpler step.
Well, the OP is talking about the headscale server (self-hosted), which runs wherever the server you install it onto lives. You just use the Tailscale clients.
Headscale is good. We're using it to manage two isolated networks of about 400 devices each. It just works. It's in China, so the official Tailscale DERPs don't work, but enabling the built-in DERP was very easy.
headscale is an awesome project. And I love tailscale as a product.
But this is where NetBird beats Tailscale: its coordination server was open source and self-hostable out of the gate.
Headscale is currently maintained by a few Tailscale employees in their spare time. For now, Tailscale allows this to happen, but clearly there's some internal management of what gets downstreamed to headscale.
What I don’t like about headscale is that you can only host a single coordinator server as well. If I need to do maintenance on the server, it means an impact to the tailnet. It’s rare but annoying.
> What I don’t like about headscale is that you can only host a single coordinator server as well. If I need to do maintenance on the server, it means an impact to the tailnet. It’s rare but annoying.
Any p2p connections should keep working for some time even if the coordinator goes down... right?
Headscale mostly works pretty well, but it's pretty finicky to get set up in a way where the Tailscale clients on Linux and Android aren't constantly complaining with warnings or having route or DNS issues. I'm considering investigating one of those non-commercial solutions where the entire stack was built to work together.
Apparently they've deprecated Postgres support and now only recommend sqlite as the storage backend. I have nothing against sqlite but to me this looks like Tailscale actively signaling what they think the expected use of headscale is.
> Scaling / How many clients does Headscale support?
> It depends. As often stated, Headscale is not enterprise software and our focus is homelabbers and self-hosters. Of course, we do not prevent people from using it in a commercial/professional setting and often get questions about scaling.
> Please note that when Headscale is developed, performance is not part of the consideration as the main audience is considered to be users with a modest amount of devices. We focus on correctness and feature parity with Tailscale SaaS over time. [...]
> Headscale calculates a map of all nodes that need to talk to each other, creating this "world map" requires a lot of CPU time. When an event that requires changes to this map happens, the whole "world" is recalculated, and a new "world map" is created for every node in the network. [...]
> Headscale will start to struggle when e.g. many nodes with frequent changes cause resource usage to remain constantly high. In the worst-case scenario, the queue of nodes waiting for their map will grow to a point where Headscale will never be able to catch up, and nodes will never learn about the current state of the world.
I find that quite interesting and it is one of the reasons I've not really considered trying out Headscale myself.
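A rough model of the failure mode the FAQ describes, assuming one change triggers a fresh map per node and each map enumerates all nodes (the capacity numbers below are invented, purely to show the quadratic shape):

```python
def entries_rebuilt_per_event(n_nodes: int) -> int:
    """One topology change -> a new map for each node, each listing all nodes."""
    return n_nodes * n_nodes  # O(n^2) map entries rebuilt per event

def coordinator_falls_behind(n_nodes: int, events_per_s: float,
                             entries_per_s: float) -> bool:
    """True if the required rebuild rate exceeds the coordinator's capacity."""
    return entries_rebuilt_per_event(n_nodes) * events_per_s > entries_per_s

# With a made-up capacity of 1M entries/s and one change per second:
print(coordinator_falls_behind(100, 1.0, 1_000_000))   # False: 10k/s needed
print(coordinator_falls_behind(5000, 1.0, 1_000_000))  # True: 25M/s needed
```

Same churn rate, 50x the nodes, 2500x the work: that is why a homelab never notices and a large fleet with flapping nodes can end up permanently behind.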
Why? Makes perfect sense to me. Designing a product with a specific use case in mind is good. When you've got the limited resources of an open source volunteer project, trying to solve every problem is a recipe for burnout, if it can even be done.
I mean, this is a great advertisement in and of itself. Something being considered "enterprise software" means it will have 90% more features than needed, the code will be a combination of dozens of different mid-level devs' shiny new abstractions, and only the code paths through the features the original enterprise valued will be tested. I.e., it's great if you work in an enterprise, as it will generate a lot of work with an easy scapegoat.
I don't understand what these two have to do with anything. The db use is almost trivial, and SQLite can be embedded. Why would we want wasted effort and configuration complexity to support Postgres?
By that logic you wouldn't need headscale at all and would just ask your favorite LLM to write a similar tool for you, with your own requirements and nothing else.
No, it's not really necessary to extrapolate the logic any further. You have deemed a very specific and focused task "wasted effort." So the logic leads to putting in the effort you don't find wasteful and outsourcing the remainder, this very specific thing, to the LLM.
TIL! My problem with them requiring SQLite was that I assumed it would make a high-availability setup either hard or impossible. Maybe that's not true, but it's definitely off the beaten path for headscale.
Yeah, the Headscale people don't hide that it's a toy. I didn't get a homelab full of datacentre-grade equipment because I want to use toy, non-scaling solutions with vastly incomplete feature sets, but for the exact opposite reason.
On a different note; the HN obsession with SQLite these days is getting a bit tiresome.
I started with a docker container that connected to both the VPN provider and tailscale. Now OPNsense is handling a few connections to the VPN provider at a couple of locations around the world, and enforcing that external traffic is routed to the VPN connections via VLAN tags (untagged has direct internet access).
Using the VPN provider can either be adding a VLAN tag to a machine/container or connecting to a "vpn-{location}" tailscale exit node.
I'd say no, but it really depends on your use case. The biggest barrier is that it doesn't have an HA story that I'm aware of, though you might be able to build one by carefully replicating the SQLite database and using something like Pacemaker to fail over and fail back.
That said, I've been using headscale on 220 devices for ~3.5 years now and it's been quite reliable.
yeah, looks like someone is either a hyper Tailscale fan or has had an extremely bad experience with it. I also run several dozen machines (and tablets and phones) on it, and I've never had a single moment of downtime since I started.
So instead of opening a port on my firewall for WireGuard, I must have these ports public exposed:
* tcp/80
* tcp/443
* udp/3478
* tcp/50443
I don't know about you, but that seems like an insane approach to me.
Even if the HTTP-01 challenge is not used, you're still exposing 3 ports instead of 1 random high port like 55555, for example.
Yeah, yeah, you can use a reverse proxy, but still: you're exposing way more ports and services to the internet than the single port WireGuard itself needs.
- TCP/80 is only required to answer let’s encrypt challenges for certificate issuance
- UDP is only required to enable DERP.
These are both optional.
It’s not surprising that there are additional ports required on top of Wireguard. 443 is likely for key distribution and management. If you don’t want PKI then you don’t need headscale; you can always distribute the keys yourself and just run plain wireguard
>If you don’t want PKI then you don’t need headscale; you can always distribute the keys yourself and just run plain wireguard
It makes more sense to me to use WireGuard + SPA (fwknop, a replacement for port knocking that requires a pre-shared key to even talk to the port; only that source IP gets access (via iptables), and any scan tool sees the port as closed).
Headscale/Tailscale only has value if you're behind CGNAT; otherwise it just adds extra management and complexity.
Well, it also lets you federate access and manages the keys for you. But yeah, if it’s a personal setup and you have good key rotation hygiene, I agree with you: it doesn’t add much value on top of wireguard. I’ll hazard a guess that you can just run your own DERP relay too for the CGNAT case.
80/443 is all that's necessary for Headscale as a control server.
UDP/3478 is STUN for the embedded DERP. I recommend hosting a distinct DERP server, thus decoupling the control and data planes. DERPer is open source from Tailscale.
50443 is for GRPC. I'd not expose that, even if it is protected by authentication (and tested).
At the moment, I think the best you can do is qwen3-coder:30b. It works, and it's nice to get some fully-local LLM coding up and running, but you'll quickly realize that you've long since tasted the sweet forbidden nectar that is hosted LLMs. Unfortunately.
Just be super careful to understand what CPU time means before you go ahead and waste time on this. They don't immediately flag it, but once you go past a very, very small threshold of used CPU time, they'll start aborting requests. This doesn't happen (as harshly) on fully paid accounts, of course.
But since many are comfortably being dragged into the Cloudflare vortex through their otherwise generous free offers, you'll find that the Cloudflare Workers CPU-time limitation can turn into a huge waste of time after the fact, once the worker code you converted a few days ago, and were all joyful about, suddenly starts failing.
Addendum: just to illustrate the moment where you'll trip over it: the docs casually mention a default limit of 30s without making clear that this *only* applies to paid accounts. Only somewhere further down is there a tiny mention of the 10ms free-tier limit!
https://developers.cloudflare.com/workers/platform/limits/#c...
So, if your script can get by with a max of 10 milliseconds of CPU time per invocation (not wall-clock runtime), you'll be fine. You will, however, and this is crucial, only realize this a few days in. They take a rolling average and eventually cap you, and it stops responding.
I've found the memory limits block more of my projects than CPU time does. They seem to send multiple requests to a single node/process, and if you're building some sort of Remix app, it easily breaks under any kind of load.
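The crucial distinction is that CPU time only accrues while your code is actually executing, not while it awaits I/O. A small Python illustration of the difference (the sleep stands in for a slow upstream fetch; the 10 ms figure is the documented free-tier cap):

```python
import time

def burn_cpu(ms: float) -> None:
    """Spin until roughly `ms` milliseconds of CPU time have been consumed."""
    start = time.process_time()
    while (time.process_time() - start) * 1000 < ms:
        pass

wall_start = time.time()
cpu_start = time.process_time()

time.sleep(0.2)  # waiting on "I/O": wall clock advances, CPU time does not
burn_cpu(5)      # actual computation: this is what counts against the cap

wall_ms = (time.time() - wall_start) * 1000
cpu_ms = (time.process_time() - cpu_start) * 1000
print(f"wall: {wall_ms:.0f} ms, cpu: {cpu_ms:.0f} ms")
# ~200+ ms of wall time but only ~5 ms of CPU time: a request shaped
# like this fits comfortably under a 10 ms CPU-time cap.
```

So a worker that mostly proxies or awaits fetches is fine; one that parses, transforms, or hashes payloads is the kind that trips the cap days later.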
Self-hosting is more a question of responsibility I'd say. I am running a couple of SaaS products and self-host at much better performance at a fraction of the cost of running this on AWS. It's amazing and it works perfectly fine.
For client projects, however, I always try and sell them on paying the AWS fees, simply because it shifts the responsibility of the hardware being "up" to someone else. It does not inherently solve the downtime problem, but it allows me to say, "we'll have to wait until they've sorted this out, Ikea and Disney are down, too."
Doesn't always work like that and isn't always a tried-and-true excuse, but generally lets me sleep much better at night.
With limited budgets, however, it's hard to accept the cost of RDS (and we're talking with at least one staging environment) when comparing it to a very tight 3-node Galera cluster running on Hetzner at barely a couple of bucks a month.
Or Cloudflare, the titan at the front, being down again today and intermittently the past two days, after also being down a few weeks ago and earlier this year. I also had SQS queues time out several times this week; they picked up again shortly, but it's not like those things ...never happen on managed environments. They happen quite a bit.
Over 20 years I've had lots of clients on self-hosted setups, even self-hosting SQL on the same VM as the webserver, as you used to in the long-distant past for low-usage web apps.
I have never, ever, ever had a SQL box go down. I've had a web server go down once. I had someone who probably shouldn't have had access to a server accidentally turn one off once.
The only major outage I've had (2-3 hours) was when the box was also self-hosting an email server and I accidentally caused it to flood itself with failed-delivery notices with a deploy.
I may have cried a little in frustration and panic but it got fixed in the end.
I actually find using cloud-hosted SQL in some ways harder and more complicated, because it's such a confusing mess of cost versus what you're actually getting. The only big complication of self-hosting is setting up backups, and that's a one-off task.
As someone who has set this up while being neither a DBA nor a sysadmin:
Replication and backups really aren't that difficult to set up properly with something like Postgres. You can also expose metrics and set up alerting for when replication lag goes beyond a threshold you set, or when a backup didn't complete. You do need to periodically test your backups, but that's good practice anyway.
I'm not saying something like RDS doesn't have value, but you're paying a huge premium for it. Once you reach a steadier state, owning your database totally makes sense. A cluster of $10-20 VPSes with NVMe drives can deliver really good performance and will take you a lot farther than you might expect.
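As a sketch of the alerting idea above: the usual Postgres approach is to compare the primary's current WAL position with the replica's replayed position (both reported as LSNs like 16/B374D848) and alert past a byte threshold. The helper names and the 64 MB threshold below are illustrative, not prescriptive:

```python
def lsn_to_bytes(lsn: str) -> int:
    """Convert a Postgres LSN such as '16/B374D848' to an absolute byte offset."""
    hi, lo = lsn.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)

def replication_lag_bytes(primary_lsn: str, replica_lsn: str) -> int:
    """How many WAL bytes the replica still has to replay."""
    return lsn_to_bytes(primary_lsn) - lsn_to_bytes(replica_lsn)

def should_alert(primary_lsn: str, replica_lsn: str,
                 threshold_mb: int = 64) -> bool:
    return replication_lag_bytes(primary_lsn, replica_lsn) > threshold_mb * 1024 * 1024

print(should_alert("16/B374D848", "16/B374D845"))  # False: 3 bytes behind
print(should_alert("17/00000000", "16/00000000"))  # True: ~4 GiB behind
```

In practice you'd feed it `SELECT pg_current_wal_lsn()` from the primary and `SELECT pg_last_wal_replay_lsn()` from the replica on a schedule, and wire the boolean into whatever alerting you already run.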
I think the pricing of the big three is absurd, so I'm on your side in principle. However, it's the steady state that worries me. When the box has been running for 4 years and nobody who works there has any (recent) experience operating postgres anymore. That shit makes me nervous.
More than that: it's easier than it ever was to set up, but we live in a post-truth world where nobody wants to own their shit (both figuratively and concretely)...
datasette and datasette-lite (WASM with Pyodide) are web UIs for SQLite, paired with sqlite-utils.
For read-only applications, it's possible to host datasette-lite and the SQLite database as static files on a redundant CDN. Datasette-lite + a URL-redirect API + Litestream would probably work well, maybe even read-write; electric-sql also has a sync engine (with optional partial replication), and there's PGlite (Postgres in WebAssembly).
So can the cloud, and the cloud has had more major outages in the last 3 months than I've seen on self-hosted setups in 20 years.
Deploys these days take minutes, so what's the problem if a disk does go bad? You lose at most a day of data if you go with 'standard' overnight backups, and if it's mission-critical you'll have already set up replicas, which again is pretty trivial and only slightly more complicated than doing it on cloud hosts.
> ...you will have already set up replicas, which again is pretty trivial and only slightly more complicated than doing it on cloud hosts.
Even on PostgreSQL 18 I wouldn't describe self-hosted replication as "pretty trivial". On RDS you get an HA replica (or cluster) by clicking a radio button.
For this kind of small-scale setup, a reasonable backup strategy is all you need. The one critical part is that you actually verify your backups are taken and actually work.
Hardware doesn't fail that often. A single server will easily run for many years without any issues if you're not unlucky. And many smaller setups can tolerate the downtime of renting a new server or VM and restoring from backup.
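The "actually verify" part can be partially automated: record a checksum at backup time and re-check it before trusting the file. The real test is still a periodic restore into a scratch instance, but this catches truncated or corrupted dumps cheaply. A minimal sketch (function names are my own, not from any tool):

```python
import hashlib
from pathlib import Path

def sha256_of(path: Path) -> str:
    """Stream the file through SHA-256 in 1 MiB chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def verify_backup(dump: Path, recorded_digest: str, min_size: int = 1) -> bool:
    """Reject missing, suspiciously small, or corrupted dump files."""
    if not dump.exists() or dump.stat().st_size < min_size:
        return False
    return sha256_of(dump) == recorded_digest
```

Run it from cron after each backup and alert on a False; set `min_size` to something near your typical dump size so a silently empty dump also fails.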
Not as often as you might think. Hardware doesn’t fail like it used to.
Hardware also monitors itself reasonably well because the hosting providers use it.
It's trivial to run mirrored containers on two separate Proxmox nodes, because hosting providers use the same kind of stuff.
Offsite backups and replication? Also point and click and trivial with tools like Proxmox.
RAID is actually trivial to set up, if you don't compare it to doing it manually yourself from the command line. Again, tools like Proxmox make it point-and-click plus 5 minutes of watching YouTube.
If we want to find a solution, our brain will find it. If we don't, we'll find reasons not to.
One thing that will always stick in my mind is one time I worked at a national Internet service provider.
The log disk was full or something. That's not the shameful part, though. What followed was a mass email saying everyone needs to update their connection string from bla bla bla 1 dot foo dot bar to bla bla bla 2 dot foo dot bar.
This was inexcusable to me. I mean this is an Internet service provider. If we can't even figure out DNS, we should shut down the whole business and go home.
Yes, you are correct. But actually, I am not claiming someone claimed it :) What I am getting at is the idea, which the "business people" usually bring up, that they are looking after the user's/customer's interest and that others lack the "business mind"; while actually, when it comes to this kind of decision making, all of that goes out the window, because they want to shift the blame.
Taking a few more steps back: most of the services we use are not so essential that we cannot bear them being down a couple of hours over the course of a year. We have seen that over and over again with the Cloudflare and AWS outages; the world continues to revolve. If we were a bit more reasonable and realistic about required uptime guarantees, there wouldn't be much worry about something being down every now and then, and we wouldn't need to worry about our livelihood if we need to reboot a customer's database server once a year, or about their impression of the quality of the system we built if such a thing happens.
But even that is unlikely if we set things up properly. I have worked at a company where we self-hosted our platform, and it didn't have the most complex fail-safe setup ever. Just have good backups and make sure you can restore, and 95% of the worries go away for such non-essential products; outages were less frequent than trouble with AWS or Cloudflare.
It seems that either way, you need people who know what they are doing, whether you self-host or buy some service.
That's more of a small-business-owner perspective. For a middle manager, rattling some cages during a week of IBM downtime is adequate performance, while it's unclear how much performative response is necessary if the mom-and-pop shop is down for a day.
You have to consider the class of problems as a whole, from the perspective of management:
- The cheap solution would be equally good, and it's just a blame shifting game.
- The cheap solution is worse, and paying more for the name brand gets you more reliability.
There are many situations that fall into the second category, and anyone running a business probably has personal memories of making the second mistake. The problem is, if you're not up to speed on the nitty gritty technical details of a tradeoff, you can't tell the difference between the first category and the second. So you accept that sometimes you will over-spend for "no reason" as a cost of doing business. (But the reason is that information and trust don't come for free.)
It's also better for the technical people. If you self-host and the DB goes down at 2am on a Sunday morning, all the technical people are gonna get woken up, and they'll be working on it until it's fixed.
If us-east goes down a technical person will be woken up, they'll check downdetector.com, and they'll say "us-east is down, nothin' we can do" and go back to sleep.
Just wait until you end up spending $100,000 for an awful implementation from a partner who pretends to understand your business needs but delivers something that doesn't work.
But perhaps I’m bitter from prior Salesforce experiences.
This is a major reason the cloud commands such a premium. It’s a way to make down time someone else’s problem.
The other factor is eliminating the “one guy who knows X” problem in IT. What happens if that person leaves or you have to let them go? But with managed infrastructure there’s a pool of people who know how to write terraform or click buttons and manage it and those are more interchangeable than someone’s DIY deployment. Worst case the cloud provider might sell you premium support and help. Might be expensive but you’re not down.
Lastly, there’s been an exodus of talent from IT. The problem is that anyone really good can become a coder and make more. So finding IT people at a reasonable cost who know how to really troubleshoot and root cause stuff and engineer good systems is very hard. The good ones command more of a programmer salary which makes the gap with cloud costs much smaller. Might as well just go managed cloud.
I never understood the argument that a senior IT person's salary competes with the cloud expenses. In my contracting and consulting career I have done all of programming, monitoring, and DevOps many times; the cost of my contract is amortized over multiple activities.
The way you present it makes sense of course. But I have to wonder whether there really are such clear demarcation lines between responsibilities. At least over the course of my career this was very rarely the case.
That is called the "bus factor" (or "lottery factor"): if the one IT guy gets hit by a bus, or wins the lottery and quits, what happens? You want a bus factor of two or more: two people would have to get hit by a bus for the company to have a big problem.
There's a bus factor equivalent with the cloud, too. The power to severely disrupt your service (either accidentally, or on purpose) rests with a single org (and often, a single compliance department within that org).
Ironically, this becomes more of a concern the larger the supplier. AWS can live with firing any one of their customers - a smaller outfit probably couldn't.
Many people do inaccurately equate IaC with “cloud native” or cloud “only”.
It can certainly fit into a particular cloud platform’s offerings. But it’s by no means exclusive to the cloud.
My entire stack can be picked up and redeployed anywhere where I can run Ubuntu or Debian. My “most external” dependencies are domain name registries and an S3-API compatible object store, and even that one is technically optional, if given a few days of lead time.
> Self-hosting is more a question of responsibility I'd say. I am running a couple of SaaS products and self-host at much better performance at a fraction of the cost of running this on AWS
It is. You need to answer the question: what are the consequences of your service being down for let's say 4 hours, or some security patch isn't properly applied, or you have not followed the best practices in terms of security? Many people are technically unable, or lack the time or the resources, to confidently address that question, hence paying for someone else to do it.
Your time is money though. You are saving money but giving up time.
Like everything, it is always cheaper to do it (it being cooking at home, cleaning your home, fixing your own car, etc) yourself (if you don't include the cost of your own time doing the service you normally pay someone else for).
> Like everything, it is always cheaper to do it (it being cooking at home, cleaning your home, fixing your own car, etc) yourself (if you don't include the cost of your own time doing the service you normally pay someone else for).
In a business context the "time is money" thing actually makes sense, because there's a reasonable likelihood that the business can put the time to a more profitable use in some other way. But in a personal context it makes no sense at all. Realistically, the time I spend cooking or cleaning was not going to earn me a dime no matter what else I did, therefore the opportunity cost is zero. And this is true for almost everyone out there.
Heh, true. Although in fairness I said the business can repurpose the time to make money, not that they will. I'm splitting hairs, but it seems in keeping with the ethos here. ;)
You can pay someone else to manage your hardware stack, there are literal companies that will just keep it running, while you just deploy your apps on that.
> It is. You need to answer the question: what are the consequences of your service being down for let's say 4 hours or some security patch isn't properly applied or you have not followed the best practices in terms of security?
There is one advantage a self-hosted setup has here: if you set up a VPN, only your employees have access, and the server need not be reachable from the internet at all. So even in the case of a zero-day that WILL make a SaaS company leak your data, you can be safe(r) with a self-hosted solution.
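To sketch the pattern (a hedged example; all keys, addresses, and interface names are hypothetical placeholders): a WireGuard server config with one peer per employee device, with internal services bound only to the VPN interface so nothing listens on the public internet.

```ini
# Hypothetical /etc/wireguard/wg0.conf on the self-hosted server.
# Internal services (Postgres, admin UIs, etc.) bind to 10.8.0.1,
# not 0.0.0.0, so only VPN peers can reach them.
[Interface]
Address = 10.8.0.1/24
ListenPort = 51820
PrivateKey = <server-private-key>

[Peer]
# one [Peer] block per employee device
PublicKey = <employee-public-key>
AllowedIPs = 10.8.0.2/32
```

The only publicly exposed port is WireGuard's UDP port, which doesn't even respond to unauthenticated packets.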
> Your time is money though. You are saving money but giving up time.
The investment compounds. Setting up infra to run a single container for some app takes time, and there is a good chance it won't pay for itself.
But the 2nd service? Cheaper. The 5th? At that point you've probably automated enough that it's just pointing at a Docker container and tweaking a few settings.
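To illustrate the compounding (a hedged sketch; image names, hostnames, and the Traefik-based routing are hypothetical assumptions, not anyone's actual setup): once a shared reverse proxy and backup routine exist, each additional service in a docker-compose file is just a few more lines.

```yaml
# Assumes a Traefik reverse proxy is already running on this host.
services:
  wiki:
    image: ghcr.io/requarks/wiki:2
    restart: unless-stopped
    labels:
      - "traefik.http.routers.wiki.rule=Host(`wiki.example.internal`)"
  # the Nth service is the same handful of lines again:
  uptime:
    image: louislam/uptime-kuma:1
    restart: unless-stopped
    labels:
      - "traefik.http.routers.uptime.rule=Host(`status.example.internal`)"
```

The fixed cost (proxy, TLS, backups, monitoring) is paid once; the marginal cost per service trends toward a copy-pasted block.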
> Like everything, it is always cheaper to do it (it being cooking at home, cleaning your home, fixing your own car, etc) yourself (if you don't include the cost of your own time doing the service you normally pay someone else for).
It's cheaper even if you include your own time. You pay a technical person at your company to do it. A SaaS company does the same, then also pays sales and PR people to sell it, pays income tax on it, and on top of that needs to "pay" its investors.
Yeah, making a service for 4 people in a company can be more work than just paying $10/mo to a SaaS company. But 20? 50? 100? It quickly gets to the point where self-hosting (whether actually "self", or on dedicated servers, or in the cloud) actually pays off.
Unironically - I agree. You should be outsourcing things that aren't your core competency. I think many people on this forum have a certain pride about doing this manually, but to me it wouldn't make sense in any other context.
Could you imagine accountants arguing that you shouldn't use a service like Paychex or Gusto and should just run payroll manually? After all, it's cheaper! Just spend a week tracking taxes and benefits and signing checks.
Self-hosting, to me, doesn't make sense unless you are 1) doing something not offered by the cloud, or a pathological use case, 2) running a hobby project, or 3) in maintenance mode on the product. Otherwise your time is better spent on your core product, and if it isn't, you probably aren't busy enough. If the cost of your RDS cluster is that expensive relative to your traffic, you probably aren't charging enough, or your business economics really don't make sense.
I've managed large database clusters (MySQL, Cassandra) on bare-metal hardware in managed colo in the past. I'm well aware of the performance that's being left on the table and what the cost difference is. For the vast majority of businesses, optimizing for self-hosting doesn't make sense, especially if you don't have PMF. For a company like 37signals, sure, product velocity probably is very high, and you have engineering cycles to spare. But if you aren't profitable, self-hosting won't make you profitable, and your time is better spent elsewhere.
You can outsource everything, but outsourcing critical parts of the company also puts the existence of the company in the hands of a third party. Is that an acceptable risk?
Control and risk management cost money, be that by self hosting or contracts. At some point it is cheaper to buy the competence and make it part of the company rather than outsource it.
I think you and I simply disagree about your database being a core/critical part of your stack. I believe RDS is good enough for most people, and the only advantage you would have in self hosting is shaving 33% off your instance bill. I'd probably go a step further and argue that Neon/CockroachDB Serverless is good enough for most people.
I'm totally with you on the core vs. context question, but you're missing the nuance here.
Postgres operations are part of the core of the business. It's not a payroll management service where you should comparison-shop once the contract comes up for renewal and haggle on price. Once Postgres is the database for your core systems of record, you are not switching away from it. The closest analog is how difficult it is/was for anybody who built a business on top of an Oracle database to switch away from Oracle. But Postgres is free ^_^
The question at heart here is whether the host for Postgres is context or core. There are a lot of vendors for Postgres hosting: AWS RDS and CrunchyData and PlanetScale etc. And if you make a conscious choice to outsource this bit of context, you should be signing yearly-ish contracts with support agreements and re-evaluating every year and haggling on price. If your business works on top of a small database with not-intense access needs, and can handle downtime or maintenance windows sometimes, there's a really good argument for treating it that way.
But there's also an argument that your Postgres host is core to your business as well, because if your Postgres host screws up, your customers feel it, and it can affect your bottom line. If your Postgres host didn't react in time to your quick need for scaling, or tuning Postgres settings (that a Postgres host refuses to expose) could make a material impact on either customer experience or financial bottom-line, that is indeed core to your business. That simply isn't a factor when picking a payroll processor.
Setting aside the assumption that you will automatically have as good or better uptime than a cloud provider, I just feel you aren't being thoughtful enough with the comparison. In what world is payroll not as important as your DBMS? If you can't pay people, you don't have a business!
If your payroll processor screws up and you can't pay your employees or contractors, that can also affect your bottom line. This isn't a hypothetical - this is a real thing that happened to companies that used Rippling.
If your payroll processor screws up and you end up owing tens of thousands to ex-employees because they didn't accrue vacation days correctly, that can squeeze your business. These are real things I've seen happen.
Despite these real issues that have jammed up businesses before, rarely do people suggest moving payroll in-house. Many companies treat payroll like cloud, with no need for multi-year contracts: Gusto lets you sign up monthly with a credit card, and you can easily switch to Rippling or Paychex.
What I imagine is that you are intimately aware of how a DBMS can screw up, but not of how complex payroll can get. So in your worldview payroll is a solved problem to be outsourced, but the DBMS is not.
To me, the question isn't whether my cloud provider will have perfect uptime. The assumption that you will achieve better uptime and operations than the cloud is pure hubris; it's certainly possible, but there is nothing inherent about self-hosting that makes it more resilient. The question is whether your use case is differentiated enough that something like RDS doesn't make sense. If it's not, your time is better spent focused on your business, not on setting up dead man's switches to ensure your database backup cron is running.
> Like in what world is payroll not as important as your DBMS - if you can't pay people you don't have a business!
Most employees, contractors, and vendors are surprisingly forgiving of one-time screw-ups. Hell, even the employees who are most likely to care the most about a safe, reliable paycheck - those who work for the US federal government - weren't paid during the recent shutdown, and not for the first time, and still there wasn't some massive wave of resignations across the civil service. If your payroll processor screws up that badly, you fire them and switch processors.
If your DBMS isn't working, your SaaS isn't working. Your SLA starts getting fucked and your largest customers are using that SLA as reason to stop payments. Your revenue is fucked.
Don't get me wrong, having working payroll is pretty important. But it's not actually critical the way the DBMS is, and if it was, then yeah you'd see more companies run it in-house.
>Most employees, contractors, and vendors are surprisingly forgiving of one-time screw-ups.
If you are a new business, that isn't true. Your comparison to the US federal government is not apt at all: the USG is one of the longest-running, most stable organizations in the country. People will have plenty of patience for the USG, but they won't have it for your incorporated-last-month business.
Secondly, I could make the same argument for AWS. AWS has plenty of downtime, way more than the USG has shutdowns, and there has never been a massive wave of customers off of AWS.
Finally, as a small business, if your payroll gets fucked, your largest assets will use that to walk out the door! The second you miss payroll is the second your employees start seeing the writing on the wall; it's very hard to recover morale after that. Imagine being Uber and not paying drivers on time: they will simply drive more often with a competitor.
That said, I still see the parallels with the hypothetical "Accountant forums". The subject matter experts believe their shiny toy is the most critical to the business and the other parts aren't. Replace "US federal government" with "Amazon Web Services", and you will have your "Accountant forums" poster arguing why payroll should be done in house and SLA doesn't matter.
That’s pretty reductive. By that logic the opposite extreme is just as true: if using managed services is just as bad as outsourcing everything else, then a business shouldn’t rent real estate either—every business should build and own their own facility. They should also never contract out janitorial work, nor should they retain outside law firms—they should hire and staff those departments internally, every time, no nuance allowed.
You see the issue?
Like, I’m all for not procuring things that it makes more sense to own/build (and I know most businesses have piss-poor instincts on which is which—hell, I work for the government! I can see firsthand the consequences of outsourcing decision making to contractors, rather than just outsourcing implementation).
But it’s very case-by-case. There’s no general rule like “always prefer self hosting” or “always rent real estate, never buy” that applies broadly enough to be useful.
I'll be reductive in conversations like this just to help push the pendulum back a little. The prevailing attitude seems (to me) like people find self-hosting mystical and occult, yet there's never been a better time to do it.
> But it’s very case-by-case. There’s no general rule like “always prefer self hosting” or “always rent real estate, never buy” that applies broadly enough to be useful.
I don't know if anyone remembers that irritating "geek code" thing we were doing a while back, but coming up with some kind of shorthand for whatever context we're talking about would be useful.
No argument here, that’s a fair and thoughtful response, and you’re not wrong regarding the prejudice against self-hosting (and for what it’s worth I absolutely come from the era where that was the default approach, have done it extensively, like it, and still do it/recommend it when it makes sense).
> The Geek Code, developed in 1993, is a series of letters and symbols used by self-described "geeks" to inform fellow geeks about their personality, appearance, interests, skills, and opinions. The idea is that everything that makes a geek individual can be encoded in a compact format which only other geeks can read. This is deemed to be efficient in some sufficiently geeky manner.
That argument does not hold when there is AWS serverless Postgres available, which costs almost nothing for low traffic and is vastly superior to self-hosting regarding observability, security, integration, backups, etc.
There is no reason to self-manage Postgres for dev environments.
"Which costs almost nothing for low traffic": you invented the retort "what about high traffic" within your own message. I don't even necessarily mean user traffic, either. If you constantly have to sync new records over (as could be the case in any kind of timeseries use case), the internal traffic could rack up costs quickly.
"Vastly superior to self hosting regarding observability": I'd suggest looking into the CNPG operator for Postgres on Kubernetes. The built-in metrics and official dashboard are vastly superior to what I get from CloudWatch for my RDS clusters. And the backup mechanism, using Barman for database snapshots and WAL backups, is vastly superior to AWS DMS or AWS's disk snapshots, which aren't portable to a system outside of AWS if you care about avoiding vendor lock-in.
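For the curious, the CNPG setup is a single manifest (a hedged sketch against the CloudNativePG `Cluster` API; the cluster name, bucket, sizes, and credentials secret here are hypothetical placeholders):

```yaml
# Minimal CloudNativePG cluster with continuous Barman backups to S3.
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: example-pg
spec:
  instances: 3            # one primary, two streaming replicas
  storage:
    size: 20Gi
  backup:
    retentionPolicy: "30d"
    barmanObjectStore:
      destinationPath: s3://example-backup-bucket/pg
      s3Credentials:
        accessKeyId:
          name: backup-creds   # hypothetical Secret name
          key: ACCESS_KEY_ID
        secretAccessKey:
          name: backup-creds
          key: SECRET_ACCESS_KEY
```

Because Barman writes plain base backups plus WAL archives to the object store, the backups can be restored outside Kubernetes or outside AWS entirely, which is the portability point above.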
It only scales down after a period of inactivity though - it’s not pay-per-request like other serverless offerings. DSQL looks to be more cost effective for small projects if you can deal with the deviations from Postgres.
Ah, good to know, I hadn't seen that v2 update. Looks like a minimum of 5 minutes of inactivity to auto-pause (i.e., scale to 0), and any connection attempt (valid or not) resumes the DB.
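If I'm reading the v2 docs right (hedging: the cluster name and capacity values below are made up), scale-to-zero is opted into through the scaling configuration, with the pause window set via `SecondsUntilAutoPause`:

```
# Sketch: enable auto-pause on an Aurora Serverless v2 cluster.
# MinCapacity=0 allows scaling to zero ACUs; 300s is the minimum
# inactivity window before the cluster pauses.
aws rds modify-db-cluster \
  --db-cluster-identifier example-cluster \
  --serverless-v2-scaling-configuration \
      MinCapacity=0,MaxCapacity=4,SecondsUntilAutoPause=300
```

So the billing floor is zero compute while paused, but you still pay for storage, and the first connection after a pause eats a resume delay.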
Depends on who you ask. I guess Drew, who posted it here, may beg to differ.