MongoDB has supported the equivalent of Postgres' serializable isolation for many years now. I'm not sure what "with strong consistency benefits" means.
Or is it? Jepsen reported a number of issues like "read skew, cyclic information flow, duplicate writes, and internal consistency violations. Weak defaults meant that transactions could lose writes and allow dirty reads, even downgrading requested safety levels at the database and collection level. Moreover, the snapshot read concern did not guarantee snapshot unless paired with write concern majority—even for read-only transactions."
That report (1) is 4 years old; many things could have changed. But so far, every version reviewed has been faulty with regard to consistency.
Jepsen found a more concerning consistency bug than the above results when Postgres 12 was evaluated [1]. Relevant text:
We [...] found that transactions executed with serializable isolation on a single PostgreSQL instance were not, in fact, serializable
I have run Postgres and MongoDB at petabyte scale. Both of them are solid databases that occasionally have bugs in their transaction logic. Any distributed database that is receiving significant development will have bugs like this. Yes, even FoundationDB.
I wouldn't avoid Postgres because of this problem, just as I wouldn't avoid MongoDB because they had bugs in a new feature. In fact, I'm more likely to trust a company that is paying to consistently have their work reviewed in public.
FWIW, the latest stable release is 7.0.12, released a week or so ago: https://www.mongodb.com/docs/upcoming/release-notes/7.0/. (I'm not sure why the URL has /upcoming/ in it, actually: 7.0 is definitely the stable release.)
MongoDB had "strong consistency" back in 2013 when I studied it for my thesis. The problem is that consistency is a much bigger space than an on/off switch, and MongoDB inhabited the weaker classes of consistency for a long time while calling it strong consistency, which cost it a lot of developer trust. Postgres has a range of options, but the default is typically consistent enough to make most use cases safe, whereas Mongo's default wasn't anywhere close.
They also had a big problem trading consistency for performance, to the point that for a long time (v1-2?) they ran in a default-inconsistent mode to meet the numbers marketing was putting out. Postgres has never done this, partly because it doesn't have a marketing team, but again this lost a lot of trust.
Lastly, even at the stronger end of their consistency guarantees, and as those guarantees have increased, problems have been found again and again. It's common knowledge that it's better to find your own bugs than to have your customers tell you about them, but in database consistency this is even more true than normal. This is why FoundationDB is famous for having built a database testing setup before building a database (somewhat true). It's clear from history that MongoDB doesn't have a sufficiently rigorous testing procedure.
All of these factors come down to trust: the community lacks trust in MongoDB because of repeated issues across a number of areas. As a result, just shipping "strong consistency" or the like doesn't actually solve the root problem: people don't want to use the product.
It's fair to distrust something because you were burned by using it in the past. However, both the examples you named -- Postgres and FoundationDB -- have had similar concurrency and/or data loss bugs. I have personally seen FoundationDB lose a committed write. Writing databases is hard and it's easy to buy into marketing hype around safety.
I think you should reconsider your last paragraph. MongoDB has a massive community, and many large companies opt to use it for new applications every day. Many more people want to use that product than FoundationDB.
Can you elaborate on why ‘many large companies’ are choosing MongoDB over alternatives, and what their use cases are? I’ve been using Mdb for a decade, and with how rich the DB landscape is for optimising particular workloads, I just don’t see what the value proposition for Mdb is compared to most of them. I certainly wouldn’t use it for any data-intensive application when there are other fantastic OLAP dbs, nor for some battle-hardened distributed-nodes use case, so that leaves ‘a general-purpose db with very specific queries and limited indexes’. But then why not just use PG, as others say?
This kinda misses my point. By having poor defaults in the past, marketing claims at odds with reality, and repeatedly being found to have bugs that reduce consistency, the result is that customers have no reason to trust current claims.
They may have fixed everything, but the only way to know that is to use it and see (because the issue was trusting marketing/docs/promises). And why should people put that time in when they've repeatedly got it wrong, especially when there are options that are just better now?
Right, I was curious if you put even more time in :)
I see lots of comments from people insisting it's fixed now, but it's hard to validate which features they're using and what reliability/durability they're expecting.
Yes, I have worked on an application that pushed enormous volumes of data through MongoDB's transactions.
Deadlocks are an application issue. If you built your application the same way with Postgres you would have the same problem. Automatic retries of transactions that fail with specific error codes are a driver feature you can tune or turn off if you'd like. The same is true for some Postgres drivers.
If you're seeing frequent deadlocks, your transactions are too large. If you model your data differently, deadlocks can be eliminated completely (and this advice applies regardless of the database you're using). I would recommend you engage a third party to review your data access patterns before you migrate and experience the same issues with Postgres.
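For what it's worth, the retry loop looks about the same on either database. Here's a minimal sketch against Postgres with psycopg2, retrying on its transient deadlock/serialization error codes (the accounts table and queries are made up):

```python
import psycopg2
from psycopg2 import errors

MAX_RETRIES = 5

def transfer(conn, src, dst, amount):
    # Retry on deadlock (SQLSTATE 40P01) or serialization failure (40001),
    # which Postgres reports as transient, retryable errors.
    for attempt in range(MAX_RETRIES):
        try:
            with conn:  # commits on success, rolls back on exception
                with conn.cursor() as cur:
                    cur.execute("UPDATE accounts SET balance = balance - %s WHERE id = %s",
                                (amount, src))
                    cur.execute("UPDATE accounts SET balance = balance + %s WHERE id = %s",
                                (amount, dst))
            return
        except (errors.DeadlockDetected, errors.SerializationFailure):
            if attempt == MAX_RETRIES - 1:
                raise  # give up after a few attempts
```

Keeping the transactions this small is also what makes the retry cheap.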
Not necessarily, and not in the very common single-writer-many-readers case. In that case, PostgreSQL's MVCC allows all readers to see consistent snapshots of the data without blocking each other or the writer. TTBOMK, any other mechanism providing this guarantee requires locking (making deadlocks possible).
So: Does Mongo now also implement MVCC? (Last time I checked, it didn't.) If not, how does it guarantee that reads see consistent snapshots without blocking a writer?
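To make concrete what I mean by "consistent snapshots without blocking", here's a sketch with psycopg2 against Postgres (the items table is made up): the reader holds a REPEATABLE READ snapshot, the writer commits without ever waiting on it, and the reader still sees its original snapshot.

```python
import psycopg2

# Two independent sessions against the same database (placeholder DSN).
reader = psycopg2.connect("dbname=test")
writer = psycopg2.connect("dbname=test")
reader.set_session(isolation_level="REPEATABLE READ")

rcur, wcur = reader.cursor(), writer.cursor()
rcur.execute("SELECT count(*) FROM items")  # reader's snapshot starts here
before = rcur.fetchone()[0]

# The writer is never blocked by the reader...
wcur.execute("INSERT INTO items DEFAULT VALUES")
writer.commit()

# ...and the reader still sees its original, consistent snapshot.
rcur.execute("SELECT count(*) FROM items")
assert rcur.fetchone()[0] == before
reader.commit()  # ends the snapshot; a new transaction would see the insert
```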
Locking doesn't result in deadlocks, assuming that it's implemented properly.
If you know the set of locks ahead of time, just sort them by address and acquire them in that order, which will always succeed with no deadlocks.
If the set of locks isn't known, then assign each transaction an increasing ID.
When you try to take a lock that's already held: if the holder has a higher ID, signal it to abort (it will retry after this transaction terminates) and sleep until it releases the lock.
Otherwise, if the holder has a lower ID, abort your own transaction, wait for the conflicting transaction to finish, and then retry.
This guarantees that all transactions terminate (as long as each would terminate in isolation) and that a transaction retries at most once for each preceding running transaction.
It's also possible to detect deadlocks by tracking which thread each thread is waiting for, and then signaling either the transaction with the highest ID in the cycle, or the one that the lowest ID is waiting on, to abort, wait for the transaction it was waiting on to terminate, and retry.
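A minimal sketch of the first scheme, with Python's id() standing in for "address" as the canonical sort key:

```python
import threading

def acquire_all(locks):
    # Every caller sorts by the same global key, so no two callers can
    # hold locks in opposite orders and no deadlock cycle can form.
    ordered = sorted(locks, key=id)
    for lock in ordered:
        lock.acquire()
    return ordered

def release_all(ordered):
    for lock in reversed(ordered):
        lock.release()

a, b = threading.Lock(), threading.Lock()

# Both calls acquire in the same global order, regardless of how the
# caller lists the locks, so they can never deadlock each other.
release_all(acquire_all([a, b]))
release_all(acquire_all([b, a]))
```

The ID-based schemes (essentially wound-wait) are what you need once the lock set is only discovered as the transaction runs.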
Yes, I'm aware that deadlock can be avoided if the graph with an edge uv whenever a task tries to acquire lock v while already holding lock u is acyclic, and that this property can either be guaranteed by choosing a total order on locks and only ever acquiring them in that order, or dynamically maintained by detecting tasks that potentially violate this order and terminating them, plus retries.
However, those techniques apply only to application code where you have full control over how locks are acquired. This is generally not the case when feeding declarative SQL queries to a DBMS, part of whose job is to decide on a good execution plan. And even in application code, assuming a knowledgeable programmer, they need to either know about all locks in the world or run complex and expensive bookkeeping to detect and break deadlocks.
The fundamental problem is that locks don't compose the way other natural CS abstractions (like, say, functions) do: https://stackoverflow.com/a/2887324
Agreed that this property is useful for efficiency. The danger comes from users believing this property is a guarantee and making correctness decisions based on it. There is no such thing as a distributed lock or exclusive hold.
This is more or less where my brain was heading. The utility of an exactly-once system is quite high. But coding with the knowledge that delivery is at-least-once (hopefully, with a catch-up pass if not) means I write knowing that my work must be safe in an environment with those constraints.
The best argument I have for a Postgres background worker queue is simplicity. You're already managing a DB, so why not have the DB simplify your startup's infrastructure? Honestly, with Heroku, Sidekiq, and Redis around, it isn't even much of a simplification. But for startups simplicity is useful, so they can scale when they need to and avoid complexity when possible. But that's the only argument I'd make.
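To make "the db simplifies your infrastructure" concrete: the whole queue can be one table plus FOR UPDATE SKIP LOCKED. A sketch with psycopg2 (the jobs table and handle() are made up):

```python
import psycopg2

def handle(payload):
    print("processing", payload)  # application logic (placeholder)

DEQUEUE = """
    SELECT id, payload FROM jobs
    WHERE done = false
    ORDER BY id
    FOR UPDATE SKIP LOCKED  -- concurrent workers skip rows already claimed
    LIMIT 1
"""

def work_one(conn):
    # Claim, process, and mark done in a single transaction, so a job
    # held by a crashed worker is released and simply picked up again.
    with conn:
        with conn.cursor() as cur:
            cur.execute(DEQUEUE)
            row = cur.fetchone()
            if row is None:
                return False  # queue is empty
            job_id, payload = row
            handle(payload)
            cur.execute("UPDATE jobs SET done = true WHERE id = %s", (job_id,))
    return True
```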
I agree with the characterization of applications you've laid out and think everyone should consider whether they're working on a "tall" (most users use a narrow band of functionality) or a "wide" (most users use a mostly non-overlapping band of functionality) application.
I also agree with your take that tall applications are generally easier to build engineering-wise.
Where I disagree is that I think in general wide applications are failures in product design, even if profitable for a period of time. I've worked on a ton of wide applications, and each of them eventually became loathed by users and really hard to design features for. I think my advice would be to strive to build a tall application for as long as you can muster, because it means you understand your customers' problems better than anyone else.
> I've worked on a ton of wide applications, and each of them eventually became loathed by users and really hard to design features for.
Yes, I agree that this is the fate of most. But I refuse to believe it's inevitable; rather, I think it comes from systemic flaws in our design thinking. Most of what we learn in a college database course, most of what we read online, almost all ideas in this space transfer poorly to "wide" design. People don't realize this because those approaches do work well for tall applications, and because they're regarded religiously. This is why I say they're so much harder.
> Yes, I agree that this is the fate of most. But I refuse to believe it's inevitable
Yes exactly. It is not inevitable; I’ve worked on several “enterprise” software suites that did not suffer from this problem. However! They all had a period in their history where they did, and here is why:
Early on in a company’s history there will be a number of “big” customers from whom most of the revenue comes. To keep those customers and the money flowing, bespoke features are often added for them, and these accumulate over time. This is equivalent in character to maintaining several forks of an OSS project: long term, no forward progress can be made because all the time ends up going to maintenance.
The solution to this sorry state is to transition to an “all features must be general to the product” rule and ruthlessly enforce it. That will also mean freezing customer-specific “branches”, and there will be a temporary hit to revenue. Customers need to be conditioned to “no bespoke features”; they need to be sold on the long-term benefits and brought along for the ride.
This then enables massive scaling benefits, and the end of all your time in maintenance.
Newer OpenSSH clients and servers can use FIDO2-augmented private keys (these are the key types like ed25519-sk, generated with `ssh-keygen -t ed25519-sk`). Basically you have a normal keypair stored on the client device, plus the server requires passing a FIDO2 challenge against the YubiKey.
I've already ssh'd to my work machine. I want to send an HTTP request to my company's internal web API from that machine, but we only use WebAuthn credentials. I'm going to use curl to send the request to the web API. With basic username/password auth or TOTP it's easy for me to write a script that prompts me for my password/TOTP code and marshals it into the expected format. How do I do this with my FIDO2 private key in a way that doesn't completely undermine the whole process?
I'm not sure you can. If it is possible, it probably requires some open-source tools and a pretty painful process to get the credentials off a hardware token (if that's even possible) and go through the various API calls.
No, you cannot do a WebAuthn authentication with curl. You would need to redirect to a JavaScript-capable browser to do the authentication, and then use whatever the service returns as a token with curl (cookie, JWT, ...).
I mean, we already have this problem with stuff like OAuth2. Usually, at some point in the process, you will need to enter your credentials in some JS-capable browser.
The usual process is for your script to do an OAuth flow on an embedded web server with Okta or whatever, and to port forward that embedded server to your client machine. VS code remote handles this pretty well for example.
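Roughly, the script side of that flow looks like this sketch (the IdP URL is a placeholder, and a real flow also needs state/PKCE and the code-for-token exchange):

```python
import http.server
import urllib.parse
import webbrowser

AUTH_URL = ("https://login.example.com/authorize"
            "?client_id=...&redirect_uri=http://localhost:8400/cb")  # placeholder

class Callback(http.server.BaseHTTPRequestHandler):
    code = None

    def do_GET(self):
        # The IdP redirects the browser here with ?code=... attached.
        query = urllib.parse.urlparse(self.path).query
        Callback.code = urllib.parse.parse_qs(query).get("code", [None])[0]
        self.send_response(200)
        self.end_headers()
        self.wfile.write(b"You can close this tab.")

with http.server.HTTPServer(("127.0.0.1", 8400), Callback) as srv:
    # Open the login page in a real (JS/WebAuthn-capable) browser, then
    # wait for the IdP to redirect back to our loopback listener.
    webbrowser.open(AUTH_URL)
    srv.handle_request()  # serve exactly one request: the redirect

print("authorization code:", Callback.code)
# Next (not shown): exchange the code for a token, then pass that token
# to curl/requests against the internal API.
```

Over SSH you'd forward localhost:8400 back to the machine your browser actually runs on.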
This is a bit batty and I'm not sure it would work, but I wonder if you could expose /dev/hidraw using sshfs so your work machine would see it as a local YubiKey.
Doesn't this only solve the problem for resources I am accessing over SSH? What about if I wanted to access something over HTTP like my web browser does?
That is correct. If you actually use a browser remotely, you would need something like RDP with the WebAuthn virtual channel enabled, which unfortunately I think is currently only available from Microsoft. Some remote-control software like TeamViewer has USB passthrough, but I've no idea if that works with YubiKeys (I doubt it).
So yes, working with what I'd call a "thin client setup" is something where YubiKeys are probably not a good fit, unless the protocol for that setup supports some kind of direct USB forwarding that actually works with YubiKeys...
When GPG is your ssh agent, you can use RSA or ed25519 keys stored on a smartcard (like a Yubikey) to authenticate via SSH.
It's generally preferable to use a `-sk` key type, though, since the remote server can then essentially enforce that you're using a smartcard and not a normal keypair backed by a file.
Sure, I understand how to authenticate to my remote machine with a smartcard (and already do use this setup). I'm wondering how to authenticate to resources (over HTTP) from my remote machine while using WebAuthn.
Lots of people being negative about this, but if you've ever implemented anything that works in near-real-time at wide scale, most of this design makes sense and it works great.
One thing interested me: Why the difference in pathing between events and messages? I think the event flow makes sense, but why not have messages also go through your gateway server instead of through webapp? Surely there is needless latency there when you already have an active websocket open to gateway? I thought perhaps it was because your gateway was for egress-only, but then the events section made it clear you also handle ingress there.
My guess: it’s persisted mutations that need their own retry and deduplication logic, as well as user-facing error handling. Since it hits the db in the main region anyway, latency is similar.
Vim is not a very good diff tool (e.g. no word diff, though there's a plugin), but it's good enough for most cases and it's nice to never have to switch the way you work.
Kubernetes can use anything that conforms to the CRI interface, which in practice is either CRI-O (Red Hat) or containerd (Docker, Inc.). Podman and Docker are also consumers of both of those engines.