
There are good reasons to use LC_ALL=C for certain commands, especially if you want sorting that's consistent across different systems. Sometimes you want "aa" to be before "b" even though the Norwegian UTF-8 locale puts it after "å" (which comes after "…zæø"):

    $ echo $'a\naa\nb\nå' | LC_ALL=C sort
    a
    aa
    b
    å
    $ echo $'a\naa\nb\nå' | LC_ALL=nn_NO.UTF-8 sort
    a
    b
    å
    aa
Even worse, in some locales characters that differ at the byte level collate as equal, so sort -u drops one of them:

    $ echo $'∨\n∧' | LC_ALL=C sort -u
    ∧
    ∨
    $ echo $'∨\n∧' | sort -u
    ∨
Handling Unicode sanely requires understanding some Unicode.
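
One way to keep the override scoped to a single command is a small wrapper. This is only a sketch: `bytesort` is a made-up name, not a standard tool, and it assumes a POSIX shell with `sort` and `printf` available.

```shell
# bytesort: hypothetical wrapper that sorts by raw byte value.
# LC_ALL=C applies only to this one sort invocation, so the rest of
# the script keeps whatever locale the user configured.
bytesort() {
    LC_ALL=C sort "$@"
}

# "aa" lands before "b" no matter which locale is active
# (\303\245 is the UTF-8 encoding of "å").
printf 'a\naa\nb\n\303\245\n' | bytesort

# Byte-different lines are never merged by -u
# (\342\210\250 is "∨", \342\210\247 is "∧").
printf '\342\210\250\n\342\210\247\n' | bytesort -u
```

Everything else in the script still sees the user's locale, which matters if it prints messages or parses localized output.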


I'm well aware of that, but in general, "don't do it" still applies :-)

I'd never do it for performance alone; it would have to involve one of the issues you've pointed out (some components I build require it, erroneously IMO).



