
A question for the adepts in the field. Isn't that expected? These models are trained by making them guess the next word given a sequence of words. This means they are not modeling causal relationships between the given context (the prompt) and the answer they output. And I believe that for solving new problems, making these kinds of causal connections is necessary. So, does the fact that they can actually do better than nothing mean that they have a very basic notion of causality, or just that predicting the next word from human-written content is a poor proxy for this kind of causal relationship?
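For anyone unfamiliar with the training setup the comment refers to, here is a minimal sketch of the next-token prediction objective. It assumes a hypothetical PyTorch-style toy model (`TinyLM`, with a GRU standing in for a transformer); the point is only that the entire training signal is "predict token k+1 from tokens 1..k", with no explicit causal structure.

    # Minimal sketch (toy model, not any specific LLM) of next-token prediction:
    # given tokens t_1..t_k, train the model to assign high probability to t_{k+1}.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    vocab_size, d_model = 1000, 64

    class TinyLM(nn.Module):
        def __init__(self):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, d_model)
            self.rnn = nn.GRU(d_model, d_model, batch_first=True)  # stand-in for a transformer
            self.head = nn.Linear(d_model, vocab_size)

        def forward(self, tokens):              # tokens: (batch, seq_len)
            h, _ = self.rnn(self.embed(tokens))
            return self.head(h)                 # logits: (batch, seq_len, vocab)

    model = TinyLM()
    tokens = torch.randint(0, vocab_size, (8, 32))  # fake batch of text

    logits = model(tokens[:, :-1])                  # predictions for positions 2..n
    targets = tokens[:, 1:]                         # the actual "next word" at each position
    loss = F.cross_entropy(logits.reshape(-1, vocab_size), targets.reshape(-1))
    loss.backward()                                 # this loss is the whole training signal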

