They do. But I'm guessing you don't scale up 1M+ servers for your Uber canary traffic tests, and that is the kind of scale Amazon operates at during these events. The scale is unlike almost any other web property around.
Do they really need 1 million servers? Many of my friends who work at other tech companies need so few servers in comparison, even with significantly high traffic, that 1 million just screams massive inefficiency... which seems wrong.
But I've never worked at Amazon, so I wouldn't know.
I’m not sure how it’s relevant. If you have the infrastructure to send 1000 concurrent users, you can probably send 1M concurrent users. We only test small integer multiples of our peak traffic, and if the number of servers you need to service that is in the millions, then it makes complete sense to run that capacity test routinely. If that means “scale up 1M+ servers” then that is what you have to do; otherwise how can you be sure?