Yeah, the current solution is similar to overfitting; this won't generalize to harder math where the operation doesn't correspond to the activation function of the network.
Is this true? If it can add, it can probably subtract? If it can add and subtract it may be able to multiply (repeated addition) and divide? If it can multiply it can do exponents?
I don’t know, but I cannot jump to your conclusion without much more domain knowledge.
An 8x8-bit multiplication requires only 7 additions (summing the 8 shifted partial products), either in parallel or sequentially. Remember long-form multiplication? [1] It's the same principle. Of course, high-speed digital multiplication circuits use a much more optimized, much more complex implementation.
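To make the "7 additions" count concrete, here is a minimal sketch of the long-multiplication idea in Python. The function name `shift_add_multiply` and its `bits` parameter are made up for illustration, and this is the naive sequential scheme, not how an optimized hardware multiplier works.

```python
def shift_add_multiply(a: int, b: int, bits: int = 8) -> tuple[int, int]:
    """Multiply two unsigned `bits`-wide integers by shift-and-add.

    Returns (product, number_of_additions). Hypothetical helper for
    illustration only.
    """
    limit = 1 << bits
    if not (0 <= a < limit and 0 <= b < limit):
        raise ValueError(f"operands must fit in {bits} unsigned bits")

    # One partial product per bit of b: a shifted left by i if bit i
    # of b is set, otherwise 0 (same as one row of long multiplication).
    partials = [(a << i) if (b >> i) & 1 else 0 for i in range(bits)]

    # Summing n partial products takes n - 1 additions.
    product = partials[0]
    additions = 0
    for partial in partials[1:]:
        product += partial
        additions += 1
    return product, additions


if __name__ == "__main__":
    print(shift_add_multiply(13, 11))    # (143, 7)
    print(shift_add_multiply(255, 255))  # (65025, 7)
```

The addition count depends only on the bit width, not on the operands, since zero partial products are still summed here. A real circuit could add the partial products in a tree rather than one after another.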