
> There's no getting around fsync if you want to be sure that your data is really on the storage medium.

That's not correct; io_uring supports O_DIRECT write requests just fine. Obviously bypassing the cache isn't the same as just flushing it (which is what fsync does), so there are design impacts.

But database engines are absolutely the target of io_uring's feature set and they're expected to be managing this complexity.



O_DIRECT is not a substitute for fsync(). It only guarantees that data gets to the storage device cache, which is not durable in most cases.
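To make the distinction concrete, here is a minimal sketch of the "write, then flush" pattern, assuming a POSIX system and Python's `os` module. The function name `write_durably` is made up for illustration. The point is only that the flush is a separate, explicit step after the write; whether the file was opened with O_DIRECT does not change that.

```python
import os

def write_durably(path: str, data: bytes) -> int:
    """Write data to path, then ask the kernel to flush it to stable storage.

    Hypothetical helper for illustration only. Returns the byte count written.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        written = 0
        # os.write may write fewer bytes than requested, so loop until done.
        while written < len(data):
            written += os.write(fd, data[written:])
        # The write above may only have reached a cache. fsync asks the
        # kernel to push the data (and file metadata) down to the device
        # and to request a flush of the device's own write cache.
        os.fsync(fd)
        return written
    finally:
        os.close(fd)
```

Whether the device actually honors that flush is the drive's business, which is what the replies below get into.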


My understanding is that the storage device cache is opaque, that is, drives tend to lie, saying the write is done when it is in cache, and depend on having enough internal power capacity to flush on power loss.


Consumer devices sometimes lie (enterprise products less so), but there is a distinction between O_DIRECT and an actual fsync at the protocol layer (e.g., in NVMe, fsync maps to a Flush command).


> But database engines are absolutely the target of io_uring's feature set and they're expected to be managing this complexity.

io_uring includes an fsync opcode (with range support). When folks talk about fsync generally here, they're not saying that io_uring is unusable; they're saying that they'd expect fsync to be used, whether it's via the io_uring opcode, the system call, or some other mechanism yet to be created.


That's not what O_DIRECT is for. Did you mean O_SYNC?


If that's true (notwithstanding objections from sibling comments), then that's just another spelling of fsync.

My point was really: you can't magically get the performance benefits of omitting fsync (or functional equivalent) while still getting the durability guarantees it gives.
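As a rough illustration of the "another spelling" point, here is a sketch that opens the file with O_SYNC instead of calling fsync afterwards, assuming a POSIX system where Python exposes `os.O_SYNC`. The name `write_with_o_sync` is hypothetical. Per open(2) on Linux, each write on such a descriptor behaves as though it were followed by an fsync, so the cost is paid on every write rather than skipped.

```python
import os

def write_with_o_sync(path: str, data: bytes) -> int:
    """Write data through a descriptor opened with O_SYNC.

    Hypothetical helper for illustration only. Returns the byte count written.
    """
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC | os.O_SYNC
    fd = os.open(path, flags, 0o644)
    try:
        written = 0
        # With O_SYNC, each os.write returns only after the kernel has
        # pushed that write's data and metadata to the device, so there
        # is no separate fsync call here; the flush cost is per write.
        while written < len(data):
            written += os.write(fd, data[written:])
        return written
    finally:
        os.close(fd)
```

Either way the synchronous flush still happens somewhere; the flag only moves it from an explicit call into each write.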



