lots of learning! also sounds like a bit of pre-existing aws state.

for round two, try:

- spin up a new subaccount so you have total control of its state.

- data goes in s3 as jsonl/csv/parquet/etc.

- lambdas on cron manage ephemeral ec2 instances when heavy lifting is needed: data ingress, egress, or aggregation.

- lambdas on http handle light lifting: grab an object from s3, do some work, return a subset or aggregate of the data.

- data granularity (size and layout in s3) depends on use case. think about the latency you want for light and heavy lifting, and test different lambda/ec2 sizes and their performance processing data in s3.
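
a minimal sketch of the light-lifting path, assuming jsonl objects with a hypothetical `{"key": ..., "value": ...}` row shape. the s3 fetch and the lambda handler wiring are left out on purpose; `sumByKey` just takes an `io.Reader`, which is what the body of an s3 get would give you. the names `record`, `sumByKey`, and `sumByKeyBytes` are made up for illustration.

```go
package main

import (
	"bufio"
	"bytes"
	"encoding/json"
	"fmt"
	"io"
)

// record is a hypothetical row shape; the real schema depends on your data.
type record struct {
	Key   string  `json:"key"`
	Value float64 `json:"value"`
}

// sumByKey streams jsonl from r and returns the sum of Value per Key.
// In a lambda, r would be the body of the object fetched from s3.
func sumByKey(r io.Reader) (map[string]float64, error) {
	sums := make(map[string]float64)
	sc := bufio.NewScanner(r)
	sc.Buffer(make([]byte, 0, 64*1024), 1<<20) // allow lines up to 1 MiB
	for sc.Scan() {
		line := bytes.TrimSpace(sc.Bytes())
		if len(line) == 0 {
			continue // skip blank lines
		}
		var rec record
		if err := json.Unmarshal(line, &rec); err != nil {
			return nil, fmt.Errorf("bad line: %w", err)
		}
		sums[rec.Key] += rec.Value
	}
	return sums, sc.Err()
}

// sumByKeyBytes is a convenience wrapper for an object already in memory.
func sumByKeyBytes(b []byte) (map[string]float64, error) {
	return sumByKey(bytes.NewReader(b))
}

func main() {
	// Stand-in for a small s3 object.
	obj := []byte("{\"key\":\"a\",\"value\":1}\n{\"key\":\"b\",\"value\":2}\n{\"key\":\"a\",\"value\":3}\n")
	sums, err := sumByKeyBytes(obj)
	if err != nil {
		panic(err)
	}
	fmt.Println(sums["a"], sums["b"]) // prints: 4 2
}
```

streaming line by line keeps memory flat, so the same function works for small http-path objects and for larger ones on ec2.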

lambda is a supercomputer on demand billed by the millisecond.

ec2 spot is a cheaper supercomputer with better bandwidth on a 30 second delay billed by the second.

bandwidth to s3 is high and free within a region, for both ec2 and lambda.

bandwidth is so high that you are almost always bottlenecked on [de]serialization, and then on data processing. switch to go, then maybe to c, for cpu work.
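
one way to check where the time goes on your own data is to time the decode phase and the processing phase separately. a rough sketch, assuming json rows with a hypothetical `{"id": ..., "value": ...}` shape and synthetic in-memory data standing in for an s3 object. `row`, `makeLines`, and `decodeAndSum` are illustrative names, and the numbers will vary by machine and payload.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// row is a hypothetical record shape used only for this measurement.
type row struct {
	ID    int     `json:"id"`
	Value float64 `json:"value"`
}

// makeLines builds n synthetic json lines in memory, standing in for an s3 object.
func makeLines(n int) [][]byte {
	lines := make([][]byte, n)
	for i := range lines {
		b, _ := json.Marshal(row{ID: i, Value: float64(i)})
		lines[i] = b
	}
	return lines
}

// decodeAndSum times deserialization and processing as two separate phases.
func decodeAndSum(lines [][]byte) (sum float64, decode, process time.Duration, err error) {
	rows := make([]row, len(lines))

	start := time.Now()
	for i, l := range lines {
		if err = json.Unmarshal(l, &rows[i]); err != nil {
			return 0, 0, 0, err
		}
	}
	decode = time.Since(start)

	start = time.Now()
	for _, r := range rows {
		sum += r.Value
	}
	process = time.Since(start)

	return sum, decode, process, nil
}

func main() {
	sum, decode, process, err := decodeAndSum(makeLines(200000))
	if err != nil {
		panic(err)
	}
	fmt.Printf("sum=%.0f decode=%v process=%v\n", sum, decode, process)
}
```

if decode dominates, a cheaper format (or fewer fields) is worth trying before touching the processing code.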

dynamodb is great, but unless you need compare-and-swap, it costs too much.