
I think a lot of data tech has come full circle and is now mostly just relational databases. Our org is invested in Redshift, which lets us mostly pay as we go. The DB itself is just a Postgres facade on scalable storage with some native connectors to file stores and third parties. After rolling over our stack like three times, we're now just dumping tons of raw data into staging tables, then creating views on top of them. It's 97% raw SQL with a smattering of Python for clunky extractions. And we're now true believers in ELT vs ETL.
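
For anyone unfamiliar with the pattern, here is a rough sketch of "dump raw data into staging, then build views on top". It uses Python's built-in sqlite3 as a stand-in for the warehouse so it runs anywhere; it is not Redshift-specific, and the table name (stg_events), view name (daily_logins), and sample rows are all made up for illustration.

```python
import sqlite3

# Hypothetical raw events, standing in for an extract that is loaded as-is.
RAW_EVENTS = [
    ("2024-01-01", "signup", "  Alice@Example.com "),
    ("2024-01-01", "login", "alice@example.com"),
    ("2024-01-02", "login", "BOB@example.com"),
]


def build_warehouse(conn):
    # Load: raw rows go into a staging table untouched (no cleanup yet).
    conn.execute(
        "CREATE TABLE stg_events (event_date TEXT, event_type TEXT, email TEXT)"
    )
    conn.executemany("INSERT INTO stg_events VALUES (?, ?, ?)", RAW_EVENTS)
    # Transform: the cleanup lives in a view, so it is plain SQL over raw data
    # and can be changed later without reloading anything.
    conn.execute(
        """
        CREATE VIEW daily_logins AS
        SELECT event_date, COUNT(DISTINCT LOWER(TRIM(email))) AS users
        FROM stg_events
        WHERE event_type = 'login'
        GROUP BY event_date
        """
    )


def daily_logins(conn):
    # Query the view like any other table.
    return conn.execute(
        "SELECT event_date, users FROM daily_logins ORDER BY event_date"
    ).fetchall()


conn = sqlite3.connect(":memory:")
build_warehouse(conn)
print(daily_logins(conn))
```

The point of the ELT ordering is that the messy rows stay available in staging, so fixing a transformation is a matter of redefining the view rather than re-running an extraction job.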


Redshift with S3 storage is no different to Spark SQL with S3 storage.

Both are distributed compute, except that Spark lets you mix and match code with SQL.



