The operations on "entire arrays at once" don't benefit much from compilation, because the interpreter overhead is amortized across the size of the array.
For "one potato two potato" scalar code with lots of branches and control flow, compiled code will always be faster, because each little op carries per-dispatch overhead that a compiler could optimize away.
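A rough sketch of the amortization argument in plain Python (not APL, but the same dynamic): a bulk built-in like `sum()` pays the interpreter's dispatch cost once and then runs a compiled loop, while a hand-written scalar loop pays bytecode-dispatch overhead on every single element.

```python
import timeit

data = list(range(1_000_000))

def bulk():
    # one interpreted dispatch, then a compiled C loop over all elements
    return sum(data)

def scalar():
    # interpreter overhead (bytecode dispatch, boxing) on every iteration
    total = 0
    for x in data:
        total += x
    return total

t_bulk = timeit.timeit(bulk, number=10)
t_scalar = timeit.timeit(scalar, number=10)
print(f"bulk: {t_bulk:.3f}s  scalar: {t_scalar:.3f}s")
```

On a typical CPython the scalar loop loses badly; an APL primitive applied to a whole array is in the same position as `sum()` here, which is why compiling it buys little.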
A sufficiently smart interpreter or JIT could in principle eliminate that overhead too, but as far as I know no APL (or APL-like) language uses anything like a tracing JIT; historically they have instead focused on recognizing and optimizing idiomatic expressions, so that "typical" code runs fast.
Is the idea that most of the action happens in already-compiled data-structure manipulations in the standard library, with the terse source just specifying which ones to run?