Laziness requires immutability to work well, and that means you need to represent mutations such as IO operations in an immutable cover like the IO monad.
You've got it backwards. Laziness requires mutation to work well. Laziness is just a thunk (a closure) that can be overwritten by a value at a later time. It is impossible to implement laziness if you cannot mutate memory: if the address of the unevaluated closure and the evaluated value cannot be the same, you have to update references to the closure into references to the value, but you cannot do that if you can't mutate memory!
People always assume garbage collectors in Haskell might somehow be different due to immutability. But due to laziness it works just like Java garbage collection because laziness requires mutability.
I don't quite understand. You don't update any references when thunks are evaluated, GHC's runtime does. The runtime shields you from any mutation, doesn't it? (Unless you explicitly ask it to let you mutate a value in memory, of course. But that goes beyond the normal evaluation of thunks.)
Oh yes. I just realized we are talking past each other. I'm talking about the implementation detail of laziness. The mutation is internal and usually invisible. The only knob that I know of to adjust it is -feager-blackholing to control whether evaluation results in two mutations or one (except when switching threads: always two in that case).