I have seen many PostgreSQL benchmarks showing solid performance with terabytes of data, but my real-world experience is the complete opposite.
Here are some of the main issues that I have encountered so far:
1. Queries on large tables (around 10 GB) are slow even when an "index only scan" is used, because of MVCC and the way PostgreSQL manages concurrency.
2. Hot-standby instances can't be used for anything serious, since queries on them are regularly cancelled (and I believe it's not safe to use the "hot_standby_feedback" setting to work around this).
3. It is not possible to have tables with heavy "update" workloads, because autovacuum runs concurrently with them.
I would be very happy if anyone could show me that I am wrong.
1. You'll have to define "slow" - I have a 3 TB table where an index-only scan takes under 1 ms
2. hot_standby_feedback is absolutely safe. I've got 5 hot standbys in prod with that flag enabled.
3. Again, it depends on how "heavy" your update throughput is. It is definitely tough to find the right balance when configuring autovacuum, between "so slow that it can't keep up" and "so fast that it eats up all your I/O".
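For what it's worth, the knobs involved in that balance can be tuned per table rather than globally. A minimal sketch, assuming a hypothetical hot table `counters` (the values are illustrative, not recommendations):

```sql
-- Vacuum after ~1% of rows are dead instead of the default 20%,
-- and let the autovacuum worker do more work between sleep cycles.
ALTER TABLE counters SET (
    autovacuum_vacuum_scale_factor = 0.01,
    autovacuum_vacuum_cost_limit   = 2000,
    autovacuum_vacuum_cost_delay   = 2    -- milliseconds
);
```

Lowering the scale factor makes vacuum run more often but with less work per run, which is usually the safer direction for update-heavy tables.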
> 2. Hot-standby instances can't be used for anything serious, since queries on them are regularly cancelled (and I believe it's not safe to use the "hot_standby_feedback" setting to work around this).
Hot-standby instances are fine. We've been using these for GitLab.com for a number of years now, and never have they caused any issues. We do have some code in place that retries queries when they are cancelled by the hot-standby, but IIRC this doesn't happen often.
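For reference, the standby-side settings involved here. A sketch of a postgresql.conf fragment on the replica (values are illustrative):

```
# Tell the primary which row versions the standby still needs, so
# vacuum on the primary doesn't remove rows that running standby
# queries can still see.
hot_standby_feedback = on

# Alternatively (or in addition): give standby queries more time
# before replay conflicts cancel them, at the cost of replication lag.
max_standby_streaming_delay = 30s
```

The usual caveat with hot_standby_feedback is that it holds back vacuum on the primary while long standby queries run, trading query cancellations on the standby for some extra bloat on the primary.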
1 mostly applies to update-heavy tables, since index-only scans rely on the visibility map, which is frequently invalidated by updates.
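You can see this directly in the plan: whether an index-only scan actually avoids the heap shows up in the "Heap Fetches" line of EXPLAIN ANALYZE, and a vacuum rebuilds the visibility map. A sketch, assuming a hypothetical table `events` with an index on `id`:

```sql
-- VACUUM marks all-visible pages in the visibility map, which is
-- what lets index-only scans skip the heap entirely.
VACUUM events;

EXPLAIN (ANALYZE, BUFFERS)
SELECT id FROM events WHERE id BETWEEN 1000 AND 2000;
-- Look for "Index Only Scan" and a low "Heap Fetches" count; after
-- heavy updates, Heap Fetches climbs again until the next vacuum.
```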
3 is definitely true, especially as tables get larger. I've had success splitting frequently updated columns out into their own much smaller table, or any other trick that concentrates updates into a small table. Also, MVCC bookkeeping requires writing new row versions, so an UPDATE that doesn't change any field, or a SELECT FOR UPDATE, causes the same problem.
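The splitting trick might look like this; the table and column names are hypothetical:

```sql
-- Instead of updating a wide row in `accounts` on every request,
-- keep the frequently-updated counter in its own narrow table.
CREATE TABLE account_stats (
    account_id  bigint PRIMARY KEY REFERENCES accounts (id),
    login_count bigint NOT NULL DEFAULT 0
);

-- Hot updates now touch only the small table, so autovacuum can keep
-- up with it, and the wide `accounts` rows stay all-visible for
-- index-only scans.
UPDATE account_stats
SET login_count = login_count + 1
WHERE account_id = 42;
```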