I think I must be missing something here, but I’ll ask anyway: Why don’t the OS ...

wahern · on Oct 17, 2023

getenv returns a pointer which could be invalidated after releasing the lock, unless the lock also guards uses of that pointer and all application code uses that lock, which they most certainly do not. Likewise, this scheme does not solve direct use of environ by application code.

NetBSD has getenv_r, which copies into a buffer, but few applications use getenv_r, and certainly not all of them. And it doesn't resolve environ.

Solaris never free's env strings or environ arrays, only creating new copies and atomically swapping them. It uses a special allocator for those objects which doubles the backing buffer each time it deep copies the environ array, then argues this strategy is technically asymptotically memory bounded.

EDIT: Glancing at the code I think glibc is similar to Solaris in that it never free's env strings, but it has a heuristic to conditionally free environ arrays which means directly using environ isn't thread-safe.

cryptonector · on Oct 17, 2023

`getenv()` never returns pointers that can be invalidated. If you `putenv()` something, you commit to never freeing it or overwriting it.

hddqsb · on Oct 17, 2023

Good point about `putenv()`; however there is also `setenv()`, which does make a copy, so you are wrong about `getenv()` in general.

POSIX explicitly states "The string [returned by getenv] may be overwritten by a subsequent call to getenv(), setenv(), unsetenv(), or putenv()" (https://pubs.opengroup.org/onlinepubs/9699919799.2008edition...).

cryptonector · on Oct 17, 2023

"May", but really, it's best for `setenv()` to leak.

thayne · on Oct 17, 2023

Having a lock would still be better than the current situation. Especially if the lock was exposed so that programs that did mess with environ directly, or the pointer returned by getenv could hold the lock while doing so.

lazide · on Oct 17, 2023

Deadlock disaster that one.

woodruffw · on Oct 17, 2023

I think the missing piece here is how POSIX specifiers the environment: `getenv(3)` and `setenv(3)` are accessors for `environ`[1], which is just a pointer to some memory (over which walking is also specified[2]). That level of granularity means that it's pretty hard to add any locking here.

[1]: https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1...

[2]: https://pubs.opengroup.org/onlinepubs/9699919799/functions/s...

Retr0id · on Oct 17, 2023

I wonder if you could get around this by giving each thread its own environment context, and synchronizing them, asynchronously.

gpderetta · on Oct 17, 2023

Just independent environments would be the best, like it was done for locale with uselocale. But it is a breaking change, would have to go through posix and will take forever anyway. Also, as environ is a public symbol, it has ABI implications.

tinus_hn · on Oct 17, 2023

Because there is no way to tell when a thread is done with the buffer, there is no moment when you can be sure you can manipulate it. The options are create a new copy and leak the old one or accept the chance of a segfault.

YetAnotherNick · on Oct 17, 2023

But getenv/setenv syscall could still be under lock, which I think most of the people would use. Walking over memory could be without lock and the program could see inconsistent values as is the current behaviour.

cryptonector · on Oct 17, 2023

It should never use locks. Solaris/Illumos' getenv() is lock-less. Every C library should copy that pattern.

aaronmdjones · on Oct 17, 2023

It's a library call, not a system call. You're right though, they could implement locking.

asveikau · on Oct 17, 2023

I think in libc, there's a lot of stuff where an interface is kind of broken from a thread perspective, and it could be implemented better without changing the interface, but often people generally do not.

I can't think of any examples offhand, but I often think it about thread-local storage. Eg. lots of interfaces have an _r() equivalent where you provide the buffer, but many people still call the unsafe one which is broken when there are threads... In my mind, the best way to do this would be to use static thread-local storage in the non-_r() one, and have it call the _r() one ... Sure that has overhead and isn't a perfect solution, but it's better than "bad". But a lot of these old functions don't necessarily get love.

Gibbon1 · on Oct 17, 2023

An a sane world creating a thread would set a global that causes non thread safe library functions to seg fault. Or maybe calling them from two different threads causes a seg fault. But just make it really obvious you're doing bad stuff.

asveikau · on Oct 17, 2023

I think that sounds completely insane. The unsafety is an emergent property of what's done in the function, and entirely dependent on usage. If you were doing very disciplined use of an unsafe call, it's harmless.

Perhaps this would be a good feature of an assert, or something that breaks in a debugger if it's attached. But I don't think that is reasonable for production.

lokar · on Oct 17, 2023

That won’t help. The APi is broken. It returns a pointer.

Sprocklem · on Oct 17, 2023

The pointer isn't guaranteed to point into `environ` directly. `getenv()` could copy the value to a thread-local, (dynamically-allocated?) buffer while holding the lock.

Edit: In hindsight, a dynamic buffer would require returning ENOMEM errors (which might lead to some unexpected failures), while a static buffer would limit the value length. I think you might be right about the API being broken.

lokar · on Oct 17, 2023

Libc and much of posix predate threads. There was no way to fix everything without changing the APIs, which they did in many cases but not all.

alexey-salmin · on Oct 17, 2023

Call getenv() in a loop and you run out of memory.

harerazer · on Oct 17, 2023

Then don't do that. Of all the footguns in POSIX/C programming, having to remember to free this is really not as bad as you seem to imply.

alexey-salmin · on Oct 17, 2023

You miss the point. If you have full control over when and how getenv is called, there's no issue to begin with. The problem is that you don't, as OP demonstrates. It's perfectly natural to call getaddrinfo in a loop.

We need a new API which is not broken like in NetBSD, and a multi-year migration of all core libraries to it. Well a pity it wasn't started years ago though, could've been 95% done by now.

thayne · on Oct 17, 2023

And how would you free it? The current posix API doesn't have any way to reliably free the result returned by `getenv`.

Sprocklem · on Oct 17, 2023

I was suggesting that the buffer be invalidated by each subsequent call – like some other libc functions' internal buffers – although, as I noted in the edit this would need `getenv()` to be able to indicate errors (specifically ENOMEM). It currently cannot do this as currently described, because NULL is used to indicate an absent variable.

You could also require callers free the returned memory when they're done, but that would be another change of API.

alexey-salmin · on Oct 17, 2023

> I was suggesting that the buffer be invalidated by each subsequent call

This would break very simple single-threaded programs that e.g. print two env vars in one printf call.

lokar · on Oct 17, 2023

The solution to all problems like this was decided years ago: _r
You provide the storage and free it

The problem is these non-direct uses. They each need to switch to •_r and manage the buffer, or offer _r versions themselves and sort of pass through the problem

Sprocklem · on Oct 17, 2023

Of course, *_r is a better option, but the existing API is used so pervasively that it needs to be made thread-safe to actually avoid thread-unsafe code in, e.g, libraries.

lokar · on Oct 18, 2023

I don’t see how you can make:

- return a pointer - the library owns the allocation - the state is global and mutable

Thread safe

Sprocklem · on Oct 18, 2023

A number of libc functions return a pointer to an internal thread-local buffer, which is invalidated on subsequent calls. If the function copies the environment variable's value to such a buffer while holding the mutex controlling access to the global state, then the returned value is guaranteed to remain unaffected by other threads.

There are, however, other problems (discussed elsewhere in this thread) that complicate such an API in the context of getenv().

lokar · on Oct 18, 2023

Requiring you to hold a mutex to safely call the function is an API change

Sprocklem · on Oct 18, 2023

I was not suggesting the caller hold a mutex, but rather getenv(), which eould be transparent to the caller.

lokar · on Oct 20, 2023

Don’t you need C++11?

jrmg · on Oct 17, 2023

That makes a lot of sense. You’d need to snapshot environ when the lock was taken (when another thread could be accessing it!), which I imagine would be complicated. Although surely possible.

jclulow · on Oct 17, 2023

On at least some other operating systems, getenv(3C) and setenv(3C) are indeed thread-safe; e.g., on illumos: https://illumos.org/man/3C/getenv

We inherited our implementation from OpenSolaris, as did Oracle when they created their Solaris 11 fork. I expect at least some of the BSDs have probably fixed this as well.

o11c · on Oct 17, 2023

Just because it is documented as thread safe doesn't mean it actually is. They might have just not understood the problem (see e.g. the various indirect links to "please mandate a thread-local `getenv`").

`setenv` is a nasty beast, especially since the raw `environ` variable is also exposed (and is in fact the only way to enumerate the environment).

tjfontaine · on Oct 17, 2023

In my experience, because it is documented that’s the behavior I would expect.

That being said, I went to look[0] and it turns out it wasn’t a lie. [0]https://github.com/illumos/illumos-gate/blob/master/usr/src/...

hddqsb · on Oct 17, 2023

For the curious: They make getenv() thread-safe by intentionally leaking the old environment, which they argue is acceptable because the memory leak is bounded to 3x the space actually needed.

The getenv/setenv/putenv/environ API looks terrible on closer inspection -- it does not appear possible for an implementation to be safe, leak-free, and efficient.

o11c · on Oct 17, 2023

Just because they use locks doesn't mean it's automatically actually thread-safe.

The comment about "this is safe for people walking `environ`" is definitely a lie, for example, though the bug might be hidden on some common architectures with popular compilers in their default configuration.

asveikau · on Oct 17, 2023

I agree. I'm shocked this isn't there in 2023. Or even better, a rwlock allowing concurrent reads while serializing writes. Or some lock free algorithm for writes.

I'm pretty sure the Windows version of the environment calls has locking.

Historically you can access the environment via a global variable too, which would side-step locking schemes. But probably hardly anybody does that anymore.

jeroenhd · on Oct 17, 2023

SetEnvironmentVariable is thread safe on Windows, but their POSIX wrappers aren't. Windows has a better API design in this instance, and the guarantees by POSIX make it impossible to make a compliant implementation that can be used with threads.

asveikau · on Oct 17, 2023

> and the guarantees by POSIX make it impossible to make a compliant implementation that can be used with threads.

Can you be more specific than this? I kind of doubt it.

For example, seems to me you could write a setenv() that uses a lock-free algorithm or writes in a strategic way that won't result in a fault if getenv() or reading environ(7) runs concurrently, then say all bets are off for thread safety if you write via environ(7). That's safer than the status quo and I don't foresee it breaking POSIX.

jeroenhd · on Oct 17, 2023

Reading the spec again, I suppose it's possible to keep copies of environment variables around in memory without violating the spec, basically creating a copy for every getenv call that was modified since the last setenv/putenv. I thought the spec also specified that writes to the returned pointer would update the environment variables, but no such guarantee is given, that's just an implementation detail (and is disallowed by the API spec but good luck enforcing that).

The XSI spec does state that the result of getenv() may be overwritten by putenv() but that's not a strict requirement either.

You still risk programs and libraries expecting getenv to always return a pointer to *environ failing (I believe Go has an issue like that on MUSL).

On the other hand, the POSIX standard explicitly states that getenv() does not need to be reentrant (and therefore doesn't need to be thread safe) so any program relying on thread-safe getenv is already violating the API contract.

The rationale also seems to assume you can't make this stuff thread safe because of the standard implementation:

> The getenv() function is inherently not reentrant because it returns a value pointing to static data.

asveikau · on Oct 17, 2023

Modifying the buffer returned by getenv() seems like a terrible way to write back a value, because you could only replace it with a string with equal or shorter length. One of the problems setenv() solves is allocation.

It's important to note the difference between reentrant and thread safe. The most obvious implementation of getenv(), which would just loop through environ(7) and do a bunch of strncmp, can safely be re-entered, in that you could interrupt it and call it again and it would produce no ill effect. It just can't be overlapped with writes.

JoshTriplett · on Oct 17, 2023

There's a proposal for that: https://www.owlfolio.org/development/thread-safe-environment...

thayne · on Oct 17, 2023

That would be nice. What is the likelihood this will actually make it into libc implementations and/or become part of the posix standard?

raggi · on Oct 17, 2023

there are lots of programs that access `*environ` directly, so while this might be good, it wouldn’t solve all classes of the problem. There are also uses out there which are performance sensitive (and often just as if not more unsafe, such as holding pointers into the structure over long periods).

threaded programs should probably seriously consider retiring libc, but we don’t currently have a common ground replacement.

name related activities are one of the worst areas, contributing significantly to the glibc linkage and abi challenges, but also lacking sufficient standards for alternatives or even consensus to be built quickly.

dundarious · on Oct 17, 2023

Yes, on Linux you can even write `int main(int argc, char* argv[], char* envp[])`.

o11c · on Oct 17, 2023

`envp` is unsafe after `setenv` even in single-threaded programs though. So you really should use `environ`.

FridgeSeal · on Oct 17, 2023

The sarcastic answer to that would be something along the lines of “users should be aware of what they’re doing, and you should be more careful about calling those concurrently anyway/good-luck-have-fun”.

jrmg · on Oct 17, 2023

I’ve been searching around, and you can find a bunch of discussions about this online. Your ’sarcastic’ argument is basically the one I’ve seen in most places.

They could easily be made thread safe, but, paraphrasing, most arguments seem to come down to something like:

“setenv and getenv are POSIX functions, and not defined to lock. Just like many POSIX functions, they’ve _never_ been thread safe, and it’s an error to assume they are. Should we really start papering over client errors in use of a supposedly portable API, even though it’s working as specified? And if we make that choice pragmatically for this instance, should we be trying to do it for _all_ of POSIX? That’s impossible for some things, and would add complexity even where it’s not. For all these reasons, it’s better if these just stay dangerous like they’ve always been.”

wmf · on Oct 17, 2023

This is fine for a 1980s monolithic program but if you use any library that reads environment variables (like, ahem, libc!) you have to treat the whole library as non-thread-safe? Or keep track of the "color" of each library function?

jiggawatts · on Oct 17, 2023

This kind of historical baggage is one of the main reasons I now completely avoid C/C++ programming and won't touch it ever again. It's Rust or C# only for me from here on...

thayne · on Oct 17, 2023

The problem is that this effects higher languages too, because they often build on libc. And on some OSes, they don't have a choice, because the system call interface is unstable and/or undocumented).

For example in rust, multiple time libraries were found to be unsound if `std::env::set_env` was ever called from a multi-threaded program. See:

https://github.com/time-rs/time/issues/293 and https://github.com/chronotope/chrono/issues/499

https://github.com/rust-lang/rust/issues/27970

https://github.com/rust-lang/rust/issues/90308

Taniwha · on Oct 17, 2023

Yes it's worth remembering that the POSIX base came before threads became commonly available (or even had a standardised API)

thayne · on Oct 17, 2023

Maybe it is time to replace the POSIX base with something that is better suited to a multi-threaded environemnt.

pjmlp · on Oct 17, 2023

Which is already happening on non UNIX OSes.

Even on OSes that happen to use the Linux kernel like Android, those that insist on using the NDK and pretend it is like GNU/Linux, beyond the official supported use cases, end up bumping their heads against the wall.

FridgeSeal · on Oct 17, 2023

I’ll go one further and say that maybe it’s time we had OS’/kernel/API base that isn’t just better suited, but is explicitly designed for the massively multithreaded, ludicrously fast, massively concurrent hardware we have in spades these days.

Alas I am not as OS dev, I have not the skills or understanding to not how to build that, or what this would involve, but I do think it’s clear that what we have at the moment isn’t as well suited as it could be. Io_uring / Direct-IO seem to be better suited though.

j16sdiz · on Oct 17, 2023

For libc-developer, they could do better.

For the rest of us, I guess this is the best option. We can do wrappers, but, yak!

dataflow · on Oct 17, 2023

That's why getenv_s et al were added.

gpderetta · on Oct 17, 2023

Even if the api was perfect and used locks and returned memory to be managed by the caller, it would still be hard to use safely in a multithreading environment as long as the env is a process global property.

cryptonector · on Oct 17, 2023

On Solaris/Illumos `putenv()` and `getenv()` are lock-less and really fast.

Basically, if you `putenv()`, you commit to never freeing that memory.

xy2_ · on Oct 17, 2023

putenv is not lock free: https://github.com/illumos/illumos-gate/blob/master/usr/src/...

cryptonector · on Oct 17, 2023

Ay, yes. Though it could be made lock-less.

forrestthewoods · on Oct 17, 2023

If I were King I would ban environment variables from the OS entirely. Global mutable state is the root of all evil! Globals are evil evil evil and the modern reliance on bullshit environment variables is a plague upon reliability.

Kill it all with fire.

bayindirh · on Oct 17, 2023

Well, environment variables are not "global" globals. They are just my globals, or my post-it notes for some variables. Because they are not per-user even. They are per user session.

10 processes can have completely different set of values for the same environment variables, because they are in their own environments, and apparently, that's useful.

There are foot guns, and there are unintentional consequences of implementation and design details. This is why we patch, improve and rewrite our software over time. To iron out these kinks.

Fire is also have a tendency to cause collateral damage. So use both fire and environment variables responsibly, and world will be a better place.

gpderetta · on Oct 17, 2023

They are dynamically scoped variables. Very powerful, but only a slight step above globals.

gpderetta · on Oct 17, 2023

Should we also get rid of filesystems? Databases? All form of RPC?

forrestthewoods · on Oct 17, 2023

I definitely think a lot of filesystem access is a code smell and probably not the right thing. That one causes me a lot of pain. But that’s largely because I work in games and you really need to use the Unity/Unreal/whatever asset management system instead of direct file system access.

I’ve got a small build system and the first thing it does is nuke PATH to empty. It’s glorious. No more grabbing random shit from a big blob with untracked dependencies that varies wildly by system!

I could easily live my entire life without environment variables. They’re just a fundamentally bad idea. Every program that foolishly uses environment variables can be replaced by a better program that takes a config file or arglist.

nikanj · on Oct 17, 2023

Honestly sometimes I think the answer is yes. Imagine how happy we could be, and how many fewer problems we would have. Add printers to that list and you're describing a paradise.