So many GCP products are surprisingly terrible. Certainly not all of them, some are really good, like GKE, Cloud Storage and Cloud Load Balancer. But Cloud SQL is pretty weak, and products like Cloud Logging, Cloud Metrics and Cloud Tracing are legitimately terrible. Cloud NAT is pretty sketchy too, and can lead to a lot of egress issues if not configured perfectly.
My current workplace uses GCP, my last workplace used AWS, and personally I’ve found AWS to have much higher average quality. At my current workplace we’ve stopped using Cloud SQL, and moved our Postgres usage to Aiven (with VPC peering). Aiven seem to do a much better job operating Postgres than GCP do.
Basically, their Cloud Tracing product is broken for modern Node/Postgres (in terms of showing PG queries and whatnot in traces). Users have found the issue (and a seemingly super simple fix), but it’s been over a year and Google still haven’t fixed it. Google’s response is “yeah, we know pretty core functionality of this product is broken, but we’re not fixing it in the near future.” Or maybe ever? Many of their products feel semi-abandoned like this, especially in their observability stack - major bugs and/or performance issues that they never fix, and extremely limited features.
Cloud SQL isn’t terrible, but at least the Postgres version is one of the weaker managed Postgres offerings out there. And their whole observability stack (Logging/Monitoring/Tracing/Error Reporting) is legit terrible compared to competing products. Compared to other products I’ve used in the space, Cloud Logging is unbelievably worse than Sumo Logic, Cloud Metrics soooo much worse than Grafana+Prometheus, Cloud Tracing way worse than offerings from Datadog or New Relic, Cloud Error Reporting is ridiculously far behind Sentry, etc.
The GCP options are often quite cheap, but it shows in their extremely limited features, poor performance and plentiful bugs. Go with GCP for the things they do well, but don’t bother adopting their solution for everything simply to stick with one platform, as so many of their products are just so poor compared to competitors.
> Cloud SQL isn’t terrible, but at least the Postgres version is one of the weaker managed Postgres offerings out there. And their whole observability stack (Logging/Monitoring/Tracing/Error Reporting) is legit terrible compared to competing products. Compared to other products I’ve used in the space, Cloud Logging is unbelievably worse than Sumo Logic, Cloud Metrics soooo much worse than Grafana+Prometheus, Cloud Tracing way worse than offerings from Datadog or New Relic, Cloud Error Reporting is ridiculously far behind Sentry, etc.
Fair enough. We never used CloudWatch at my previous company, even though we used AWS for most infra, so I don’t have much experience with it. But we do use Cloud ops suite (a.k.a. Stackdriver) at my current co, and man, I miss the more standalone observability tools we used at my previous co - Sumo, Prometheus/Grafana, Sentry and New Relic. They’re sooooooo far ahead it’s not even funny.
I'm gonna use your dissatisfaction with your log management tool for a little self-promotion. I'm part of the Wrble.com team. It's a fast logging platform that is priced way lower than hosting your open source stack. We are based on Lucene technology. I believe we provide the same service that existing players do for an 80% reduction in spending. Give it a shot and let me know what you think.
Looks interesting, but can you only do very structured queries of JSON logs? We like to be able to do full text search on the whole log, i.e. find every log with a specific UUID in it, regardless of where in the log the UUID is.
Also, looks like no log aggregation? i.e. no SQL style queries on logs, that you can do in products like Sumo Logic.
GREAT pricing, but my first impression is that it’s lacking some key features we’re looking for. Seems like you guys are going for the low priced, bare bones solution, and it’s literally orders of magnitude cheaper than a really feature rich solution like Sumo, but I think it’s too stripped down for us.
Not really surprising if you consider their likely motivations.
Google isn't in the business of selling things to end users; they're in the business of selling ads. The only thing GCP gives them (outside of getting Wall Streeters off their backs a few years ago, when everyone and their brother was starting a cloud service) is a credit against their own infrastructure cost by selling excess capacity to random joes.
Therefore I'm not surprised that AWS continues to be the de facto choice: they do sell things to end users. I'm not surprised that Azure is growing quickly, either, since MS also sells things to end users and they needed a way to transition their on-premise stuff to the wires.
I mean, GCP is a decent source of revenue for them. i.e. last quarter:
* Alphabet did ~$55 billion in revenue overall last quarter, and ~$4 billion was from "Cloud", which is GCP + Workspace (I don't think they disclose how much is GCP alone?). For now it's a money loser for them: they had operating losses of ~$1 billion for Cloud. But the operating losses are shrinking over time, so it'll become profitable eventually
* In contrast, Amazon did ~$108 billion in revenue overall last quarter, and ~$13.5 billion was from AWS. Although unlike GCP, AWS is highly profitable, ~$4 billion in operating income for the quarter, which is almost half of Amazon's total operating income
But AWS isn't THAT much higher a percentage of Amazon's revenue than GCP is of Alphabet's revenue. And in terms of COSTS, AWS is actually spending less, relative to their overall revenue (Amazon spending ~$9.5 billion of $108 billion total revenue on AWS, Google spending ~$5 billion of $55 billion total revenue on "Cloud").
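As a rough check on that arithmetic, here is a minimal Python sketch using the approximate quarterly figures quoted above (in billions of USD). It assumes "spending" means segment revenue minus operating income, which is how the ~$9.5 billion and ~$5 billion numbers seem to be derived. The names `FIGURES` and `segment_shares` are made up for illustration.

```python
# Approximate quarterly figures quoted in the thread, in billions of USD.
# A negative operating income means an operating loss.
FIGURES = {
    "AWS": {"parent_revenue": 108.0, "segment_revenue": 13.5, "operating_income": 4.0},
    "Google Cloud": {"parent_revenue": 55.0, "segment_revenue": 4.0, "operating_income": -1.0},
}


def segment_shares(parent_revenue, segment_revenue, operating_income):
    """Return segment cost and its revenue/cost share of the parent's total revenue.

    Assumption: cost = segment revenue - segment operating income.
    """
    cost = segment_revenue - operating_income
    return {
        "cost": cost,
        "revenue_share": segment_revenue / parent_revenue,
        "cost_share": cost / parent_revenue,
    }


SHARES = {name: segment_shares(**figures) for name, figures in FIGURES.items()}

if __name__ == "__main__":
    for name, s in SHARES.items():
        print(
            f"{name}: cost ~${s['cost']:.1f}B, "
            f"revenue share {s['revenue_share']:.1%}, "
            f"cost share {s['cost_share']:.1%}"
        )
```

With these inputs, AWS comes out at about 12.5% of Amazon's revenue versus about 7.3% for Cloud at Alphabet, while the cost side is about 8.8% versus about 9.1% of total revenue.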
AWS has been around longer than GCP, and they've certainly spent more absolute dollars, so it makes sense it's further ahead and more polished. Yeah, AWS is more used to selling things to end users than Google, they may have a better culture for quality there, but Google invests heavily in GCP, and it's a pretty significant revenue stream for them. I'm guessing their motivations are similar, both see Cloud offerings as a big revenue stream first and foremost.
It's interesting that you are satisfied with GKE. Do you rely on the k8s API to be highly available?
We were using the API as our source of truth for Patroni, but we had to configure some really high timeouts in order to compensate for regular multi-minute API downtimes.
We need the services we run inside K8s to be highly available (as well as K8s ingress), but the K8s API we care less about. We haven't noticed any K8s API downtime issues, but I guess we mostly hit the K8s API during deploys, which for us are likely not frequent enough to notice the downtime you're talking about.