Modern compilers actually do this as well, though for slightly different reasons. GCC (at the right optimization level) likes to put branch targets and functions at aligned addresses, so you often end up with little pads of NOP instructions scattered throughout your binaries. Run 'objdump -d' on something built with 'gcc -O2' to see it firsthand, if you're interested.
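For a rough idea of how big those pads get, here is the arithmetic as a small sketch. It assumes a power-of-two alignment target such as 16 bytes, which is only an illustrative value; the real one depends on the target and on flags like -falign-functions. pad_to_align is a made-up helper name, not anything from GCC.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: number of padding bytes needed to advance `addr`
 * to the next multiple of `align`. `align` must be a power of two. */
static uint64_t pad_to_align(uint64_t addr, uint64_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    /* (-addr) mod align, computed with a mask since align is 2^k. */
    return (0 - addr) & (align - 1);
}
```

So if one function ended at 0x401136 and the next had to start on a 16-byte boundary, pad_to_align(0x401136, 16) gives 10 bytes of filler, which the compiler or assembler would fill with NOPs (usually a few multi-byte ones rather than ten single-byte ones).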
I actually have a mostly-written LD_PRELOAD library lying around that exploits this for purposes more like the ones you describe, though -- it hot-patches code in memory at program load so that system calls get dispatched via little dynamically-generated trampolines, which lets you insert calls to arbitrary tracing functions. Perhaps I'll polish it up a bit and toss it on GitHub...
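To give a flavor of the patching step, here is a minimal sketch of one building block: encoding the 5-byte x86 'jmp rel32' that would redirect execution from a patch site to a trampoline. This is not the library's actual code, just an illustration. It assumes x86 or x86-64, and encode_jmp_rel32 and JMP_REL32_LEN are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define JMP_REL32_LEN 5

/* Hypothetical helper: build an x86 `jmp rel32` (opcode 0xE9) as it would
 * be encoded if placed at address `from`, jumping to address `to`.
 * Returns 0 on success, -1 if `to` is out of rel32 range. */
static int encode_jmp_rel32(uint8_t out[JMP_REL32_LEN],
                            uint64_t from, uint64_t to)
{
    assert(out != NULL);

    /* The displacement is relative to the end of the jmp instruction. */
    int64_t disp = (int64_t)(to - (from + JMP_REL32_LEN));
    if (disp < INT32_MIN || disp > INT32_MAX)
        return -1;

    uint32_t rel = (uint32_t)(int32_t)disp;
    out[0] = 0xE9;
    /* rel32 is stored little-endian, independent of the host byte order. */
    out[1] = (uint8_t)(rel & 0xff);
    out[2] = (uint8_t)((rel >> 8) & 0xff);
    out[3] = (uint8_t)((rel >> 16) & 0xff);
    out[4] = (uint8_t)((rel >> 24) & 0xff);
    return 0;
}
```

Actually installing the bytes is the fiddly part and is left out here: the page would first have to be made writable (for example with mprotect), the patch site needs at least 5 bytes to spare, and the trampoline has to live within 2 GiB of it for rel32 to reach.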