It's time for me to re-read the bash man page. I was not aware of BASH_REMATCH, wow. It's in the first snippet on the linked page, and it would save the hassle of chaining multiple parameter expansions of the %% and ## sort.
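For anyone who hasn't used them, the %% and ## style of expansion mentioned above looks roughly like this. It is a small POSIX sh sketch; the path is a made-up sample value, not something from the linked page.

```shell
#!/bin/sh
# Hypothetical sample value, chosen only for illustration.
path="/var/log/app/server.log.gz"

file=${path##*/}   # strip longest prefix matching */  -> server.log.gz
dir=${path%/*}     # strip shortest suffix matching /* -> /var/log/app
base=${file%%.*}   # strip longest suffix matching .*  -> server
ext=${file#*.}     # strip shortest prefix matching *. -> log.gz

printf '%s\n' "$dir" "$file" "$base" "$ext"
```

Each piece needs its own expansion, which is the hassle a single regex match with capture groups can avoid.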
Bash is slower than other POSIX-compatible shells, but once you start running external commands for every substring or replace operation you lose much of that performance edge, since forking is comparatively slow.
That is one reason why I personally prefer Bashisms like ${x//str/replace} or [[ $value =~ $pattern ]] over the common x=$(echo $x | sed s/str/replace/), which has to launch two processes just for the sake of avoiding a Bashism (or grep -oP ..., which is nice, but BASH_REMATCH is often simpler).
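A minimal sketch of the two Bashisms named above, for readers who have not seen them. It needs bash rather than a plain POSIX sh, and the sample value and regex are assumptions made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical sample value; not from the thread.
value="release-2024.05-rc1"

# Global replace without forking: ${var//pattern/replacement}
dotted=${value//-/.}          # release.2024.05.rc1

# Regex match with capture groups, also without forking. Keeping the
# pattern in a variable makes bash treat it as a regex, not a quoted literal.
pattern='^([a-z]+)-([0-9]+)\.([0-9]+)'
if [[ $value =~ $pattern ]]; then
    name=${BASH_REMATCH[1]}   # release
    year=${BASH_REMATCH[2]}   # 2024
    month=${BASH_REMATCH[3]}  # 05
fi

printf '%s %s %s %s\n' "$dotted" "$name" "$year" "$month"
```

BASH_REMATCH[0] holds the whole match, and the numbered entries hold the parenthesized groups in order.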
If the script does a lot of unstreamable replacements, you're right. But there are still ways out of bash.
I prefer no fork, no bashism, reusable functions:
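    set -euf

    # replace STRING SEARCH REPLACEMENT
    # Replaces the first occurrence of SEARCH; the result is left in REPLY.
    # Returns non-zero when SEARCH does not occur.
    replace () {
        local t=
        REPLY=$1
        t="${REPLY#*"$2"}"              # everything after the first match
        test "$t" != "$REPLY" || return
        REPLY="${REPLY%"$2$t"}$3$t"
    }

    # replace_all STRING SEARCH REPLACEMENT
    # Repeats replace until nothing matches. It rescans from the start each
    # time, so it never terminates if REPLACEMENT contains SEARCH.
    replace_all () {
        REPLY=$1
        shift
        while replace "$REPLY" "$@"; do :; done
    }

    input="foo bar foo baz"
    replace_all "$input" "foo" "HELLO"
    echo "$REPLY"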
Not exactly easy to write, but now that they're functions, it doesn't matter since I can reuse them.
Regarding performance, it is slower than bash, but not significantly.
Times for 1000 calls:
9.908s -- sed (one fork per replacement)
0.015s -- bash x//str/replace
0.088s -- sh replace_all
Times for 50,000 calls:
0.351s -- bash x//str/replace
0.631s -- sh replace_all
Also, you can get further performance by inlining replace inside replace_all instead of making one call another.
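A rough sketch of what the inlined version might look like. This is not the code that was benchmarked above: the name replace_all_inline and the _rest/_tail helper variables are made up here, and plain globals are used instead of local. Unlike the two-function version, it scans forward only, so a replacement that contains the search string does not loop forever.

```shell
#!/bin/sh
# replace_all_inline STRING SEARCH REPLACEMENT  (result is left in REPLY)
# Hypothetical name; _rest and _tail are scratch globals.
replace_all_inline () {
    _rest=$1      # part of the input not scanned yet
    REPLY=        # output built so far
    # An empty SEARCH can never match; return the input unchanged.
    [ -n "$2" ] || { REPLY=$1; return 0; }
    while :; do
        _tail=${_rest#*"$2"}                # text after the first match
        [ "$_tail" != "$_rest" ] || break   # no match left
        REPLY=$REPLY${_rest%"$2$_tail"}$3   # text before the match + replacement
        _rest=$_tail                        # continue after the match
    done
    REPLY=$REPLY$_rest                      # append the unmatched remainder
}

replace_all_inline "foo bar foo baz" "foo" "HELLO"
printf '%s\n' "$REPLY"
```

Because each pass only looks at the part of the string after the previous match, the work per replacement shrinks as the scan advances instead of starting over from the beginning.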
Note that I could have done several replacements inside a single sed pipe, but I decided to measure the performance of doing it the way you suggested: x=$(echo $x | sed s/str/replace/). The same goes for my functions, one invocation per replacement (in fact, they are tuned for that scenario).
sed can absolutely beat the shell in scenarios where you can make one fork do lots and lots of replacements. It depends on the scenario, and on how proficient you are at writing sed (which can do branching, keep state, all sorts of things portably).
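To illustrate the "one fork, lots of replacements" case, here is a small sketch. The input text and the two substitutions are invented for the example; the point is only that the number of processes stays fixed no matter how many lines or substitutions there are.

```shell
#!/bin/sh
# Hypothetical input: several lines, each needing several substitutions.
input='foo bar foo baz
bar foo
baz'

# A single sed process applies every expression to every line.
output=$(printf '%s\n' "$input" | sed -e 's/foo/HELLO/g' -e 's/baz/WORLD/g')

printf '%s\n' "$output"
```

With a per-replacement fork, the same job would cost one sed launch per line per substitution.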
--
From an architectural point of view, it makes sense to have a simpler `sh` and keep a sort of standard library of functions, instead of feature creeping the interpreter with weird arcana. It makes shells like dash easier to maintain, easier to debug, easier to port and safer.
The reason Bash has so many features is that doing these things natively in the shell is faster and more convenient. After all, these features weren't just added randomly.
These features were added slowly, randomly, as time passed. The weird syntaxes for all of these are a clear sign of this.
Practically all shell interpreters suffer from decades of feature creep since the original Bourne shell. They're full of weird arcana, hard to maintain and debug.
Many people tried to replace bash and died on the hill because of these weird features, or ended up creating a replacement that is even slower, or ended up rediscovering what perl is.
Oh yeah! I was unaware too! Nowadays I quickly jump to python instead of using Bash even for the simplest of scripts, but this could help with creating tiny and easy-to-understand scripts for some integrations...
> I quickly jump to python instead of using Bash even for the simplest of scripts
You don't seem to respect the old, venerable, well-tested adage: "once your shell script becomes too complex, switch to a real programming language like python".
Or, the zen version (formally equivalent, but with quite a different tone): "once your program becomes sufficiently simple, turn it into a beautiful shell script".
The true power of shell script is to coordinate programs. Once you find yourself altering data with the shell constructs, that's the sign to use $LISP instead.
> The true power of shell script is to coordinate programs. Once you find yourself altering data with the shell constructs, that's the sign to use $LISP instead.
one might awk how much logic one can bash into a script before leaving the beloved shell