Given that this is written in Java and running on a single server with fixed hardware...
Is there a "peak" amount of optimization feasible in Java for this search engine, and if so where is it, before one would need to turn to C/Rust/etc. to get any more performance out of the given hardware?
Java's main limitation is probably in access to lower level I/O APIs, as well as vectorization support that is somewhat lackluster. There's almost definitely performance left on the table.
It's relying quite heavily on memory mapped I/O and doing some clever things to work around language limitations in how much you can memory map at a time. This permits surprisingly good but not optimal performance.
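To make the mapping limitation concrete: a single `MappedByteBuffer` is indexed by `int`, so one mapping cannot cover more than about 2 GB. A common workaround is to map a large file as an array of smaller chunks and translate a `long` offset into a chunk index plus an offset within that chunk. The sketch below is only an illustration of that general idea, not the search engine's actual code; the class name `ChunkedMappedFile`, its methods, and the chunk size are all hypothetical, and it assumes reads are 8-byte aligned so a value never straddles two chunks.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper: read-only view of a file that may be larger than
// a single MappedByteBuffer can address.
final class ChunkedMappedFile {
    private final MappedByteBuffer[] chunks;
    private final long chunkSize; // bytes per chunk, a multiple of 8

    ChunkedMappedFile(Path file, long chunkSize) throws IOException {
        if (chunkSize <= 0 || chunkSize % Long.BYTES != 0 || chunkSize > Integer.MAX_VALUE) {
            throw new IllegalArgumentException("chunkSize must be a positive multiple of 8 that fits in an int");
        }
        this.chunkSize = chunkSize;
        // The mappings stay valid after the channel is closed.
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            long size = channel.size();
            int count = (int) ((size + chunkSize - 1) / chunkSize); // round up
            chunks = new MappedByteBuffer[count];
            for (int i = 0; i < count; i++) {
                long offset = i * chunkSize;
                long length = Math.min(chunkSize, size - offset); // last chunk may be short
                chunks[i] = channel.map(FileChannel.MapMode.READ_ONLY, offset, length);
            }
        }
    }

    // Reads a big-endian long at an 8-byte-aligned absolute byte offset.
    long getLong(long byteOffset) {
        int chunk = (int) (byteOffset / chunkSize);
        int within = (int) (byteOffset % chunkSize);
        return chunks[chunk].getLong(within);
    }
}
```

In practice the chunk size would be something like 1 GB. The extra division and array lookup on every read is part of why this tends to be good but not optimal compared with a single flat mapping in native code.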
A bigger drawback is that this type of low level programming in Java is a serious pain in the ass.
Presumably there is a peak, but Java can be really, really fast.
I recently rewrote a heavy algorithm from Java to Rust, thinking that I'd get faster performance pretty much automatically. It turned out to be significantly slower than my optimized Java algorithm, and I didn't have the experience to tune the Rust version, so I ended up sticking to Java for now.
I'm sure someone who knows how could have tuned the Rust version to get better performance, but native code is not my specialty and the Java version was doing fine.
A warmed-up JVM is a lot faster than most people think, especially for a long-running app like a search engine.