Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

Thanks for posting! I am one of the author, happy to answer any question!


I remembered reading about the DBOS paper a while back - https://arxiv.org/abs/2007.11112. Is this an evolution of that research work? If so, how did an OS for databases morph into a workflow orchestration service?


It is an evolution. The DBOS workflow orchestrator places a DB at the center of your application to handle most of the complicated state management problems.


I remember reading that restate.dev is a 'push' based workflow and therefore works well with serverless workflows: https://news.ycombinator.com/item?id=40660568

what is your input on these two topics? aka pull vs push and working well with serverless workflows


Did you consider using NATS? While I haven't tried this deployment model, you can embed it in a go program as a library. If you wanted something really minimal this might be an option.

I use NATS to acheive this type of durable processing. It works well. Of course, idempotent code is needed but I don't think this can be avoided.


We decided to use Postgres because of the relational semantics, the ease of integration with user applications, and it's remarkable popularity


Is it possible for you guys to write a blog post analyzing the usage of the DB (reads, writes, what is stored for each workflow any events etc) to help users planning for scale to really understand what they are signing up.

The library seems fantastic but my team did not use this because at scale they believe that the number of DB reads and writes becomes very significant for a large number of workflows with many steps and that with PG vs Cassandra/ScyllaDB it would not be feasible for our throughput. I tried to convince them otherwise but it is difficult to quantify from the current documentation.


Good call. We'll see how to integrate it in our docs better.

The cost of DBOS durable execution is 1 write per step (checkpoint the outcome) and 2 additional writes per workflows (upsert the workflow status, checkpoint the outcome). The write size is the size of your workflows/steps output.

Postgres can support several thousands writes per seconds (influenced by the write size, ofc): DBOS can thus support several thousands of workflows/steps per second.

Postgres scales remarkably well. In fact, most org will never out scale a single, vertically scaled Postgres instance. There's a very good write up by Figma telling how they scaled Postgres horizontally: https://www.figma.com/blog/how-figmas-databases-team-lived-t...


What did your team decide to go with eventually?


Great project! Love the library+db approach. Some questions:

1. How much work is it to add bindings for new languages? 2. I know you provide conductor as a service. What are my options for workflow recovery if I don't have outbound network access? 3. Considering this came out of https://dbos-project.github.io/, do you guys have plans beyond durable workflows?


1. We also have support for Python and TypeScript with Java coming soon: https://github.com/dbos-inc

2. There are built-in APIs for managing workflow recovery, documented here: https://docs.dbos.dev/production/self-hosting/workflow-recov...

3. We'll see! :)


Elixir? Or does Oban hew close enough, that it’s not worth it?


There's a clear text password in one of your GitHub Action workflows: https://github.com/dbos-inc/dbos-transact-golang/blob/main/....


That password is only used by the GHA to start a local Postgres Docker container (https://github.com/dbos-inc/dbos-transact-golang/blob/main/c...), which is not accessible from outside.


How does DBOS scale in a cluster? with Temporal or Dapr Workflows, applications register running their supported workflows types or activities and the workflow orchestration framework balances work across applications. How does this work in the library approach?

Also, how is DBOS handling workflow versioning?

Looking forward for your Java implementation. Thanks


Good questions!

DBOS naturally scales to distributed environments, with many processes/servers per application and many applications running together. The key idea is to use the database concurrency control to coordinate multiple processes. [1]

When a DBOS workflow starts, it’s tagged with the version of the application process that launched it. This way, you can safely change workflow code without breaking existing ones. They'll continue running on the older version. As a result, rolling updates become easy and safe. [2]

[1] https://docs.dbos.dev/architecture#using-dbos-in-a-distribut...

[2] https://docs.dbos.dev/architecture#application-and-workflow-...


Thanks for the reply.

So applications continuously poll the database for work? Have you done any benchmarking to evaluate the throughput of DBOS when running many workflows, activities, etc.?


In DBOS, workflows can be invoked directly as normal function calls or enqueued. Direct calls don't require any polling. For queued workflows, each process runs a lightweight polling thread that checks for new work using `SELECT ... FOR UPDATE SKIP LOCKED` with exponential backoffs to prevent contentions, so many concurrent workers can poll efficiently. We recently wrote a blog post on durable workflows, queues, and optimizations: https://www.dbos.dev/blog/why-postgres-durable-execution

Throughput mainly comes down to database writes: executing a workflow = 2 writes (input + output), each step = 1 write. A single Postgres instance can typically handle thousands of writes per second, and a larger one can handle tens of thousands (or even more, depending on your workload size). If you need more capacity, you can shard your app across multiple Postgres servers.


Even though I don't use DBOS, that blog post is gold.


Does it natively support job priorities? E.g. if there's 10 workflows submitted and I start up a worker, how does it pick the first job.


Yeah, queue priority is natively supported: https://docs.dbos.dev/golang/tutorials/queue-tutorial#priori...


I read the Dbos vs Temporal thing, but can you speak more about if there is a different in durability guarantees?


The durability guarantees are similar--each workflow step is checkpointed, so if a workflow fails, it can recover from the last completed step.

The big difference, like that blog post (https://www.dbos.dev/blog/durable-execution-coding-compariso...) describes, is the operational model. DBOS is a library you can install into your app, whereas Temporal et al. require you to rearchitect your app to run on their workers and external orchestrator.


This makes sense, but I wonder if there’s a place for DBOS, then, for each language?

For example, a Rust library. Am I missing how a go library is useful for non-go applications?


There are DBOS libraries in multiple languages--Python, TS, and Go so far with Java coming soon: https://github.com/dbos-inc

No Rust yet, but we'll see!




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: