CPU performance is much better than in mainline llama, and more quantization types are available.