* Do you do batch and/or streaming computation?
* What kind of access will this role have for querying the data it will predominantly be working with?
* Do you have dedicated data engineers and data infrastructure people?
* What's your workflow orchestration engine?
* What is the data access pattern for historical data? (There had better be at least SQL access here.)
* Do you have built-in feedback loops for your machine learning products?
* What is your serialization format of record for production and for OLAP?
* How often do schemas change in your databases?