Well, of course slices work that way. Think about what happens if you have a ref...

friendzis · on Aug 30, 2021

> This is safe, but confuses some people.

Rust has created a weird perception that memory safety equals safety. Language is a tool and it should work with me: it is extremely important that my understanding of what the program should do aligns with what it actually does.

The way you describe go's behavior is "takes snapshot of the underlying data", which usually means "deep copy container". Taking a pointer/reference usually means quite the opposite. So it is "safe" in the sense that the pointer points to valid data, but "incorrect" in the sense that it does the wrong thing without warning.

Sure, one could argue that value-returning modification functions are a giveaway of invalidated data. But this is not C: Go is garbage collected, and instead of "forcing" the underlying array to keep the same address, it just keeps the original pointer pointing to dereferenceable but wrong data.

feffe · on Aug 30, 2021

This is how I think about Go slices (may help others understand them).

A slice itself is just a window into a backing array of fixed size. The slice carries three data members: a pointer into the backing array, the remaining capacity, and the length of the slice data.

Typically slices are passed around by value but you can take their address and modify a "shared" slice.

The built-in append() returns a new slice by value.

What happens is simply that when appending data to a slice and there is no room in the backing array, a new backing array is allocated that the returned slice points into. The old "input" slice to append is still intact and if some code has access to it, it will look at data stored in the old backing array.
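A minimal sketch of the reallocating case described above. The function name `appendWhenFull` is made up for illustration; the slice is created with `len == cap` so that `append` has no room and has to allocate.

```go
package main

import "fmt"

// appendWhenFull (hypothetical name) appends to a slice whose backing
// array is already full and reports what the old and new slices see.
func appendWhenFull() (oldFirst, newFirst int) {
	old := make([]int, 2, 2) // len == cap, so there is no room left
	grown := append(old, 3)  // must allocate a new backing array

	grown[0] = 42 // writes to the new array only

	return old[0], grown[0]
}

func main() {
	o, n := appendWhenFull()
	fmt.Println(o, n) // 0 42
}
```

The old slice still reads 0 because it keeps pointing at the old backing array, which the GC keeps alive as long as something references it.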

I've constructed similar utility types in C and find them quite convenient. It's very convenient to have the distinction between the backing memory (array) and a slice viewing a portion of it instead of just a dynamic array.

masklinn · on Aug 30, 2021

> I've constructed similar utility types in C and find them quite convenient. It's very convenient to have the distinction between the backing memory (array) and a slice viewing a portion of it instead of just a dynamic array.

Of course that's convenient. The issue with Go's slices is that they act as both a dynamic array and a slice viewing a portion of one. The two uses conflict with one another, and the interactions are full of traps.

anderskaseorg · on Aug 30, 2021

If it were true in the typical sense that “append() returns a new slice by value”, then you would expect to be able to mutate the old slice and the new slice independently from each other. But in reality, you can only do this if append() decided to reallocate, which only happens at some implementation-defined exponential pattern of sizes.

    package main
    
    import "fmt"
    
    func main() {
            a := []int{0}
            for i := 0; i < 40; i++ {
                    b := append(a, 0)
                    a[i] = 1
                    fmt.Printf("%d ", b[i])
                    a = b
            }
    }

→ 0 0 1 0 1 1 1 0 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1
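One way to see where the zeros in that output come from is to record when `append` moves to a new backing array. This is only a sketch: `reallocationPoints` is a made-up helper, and the exact growth pattern is an implementation detail that may differ between Go versions.

```go
package main

import "fmt"

// reallocationPoints (hypothetical helper) appends n elements one at a
// time and returns the slice lengths at which append had to move to a
// new backing array. A change in capacity is used as the signal, since
// append only changes the capacity when it reallocates.
func reallocationPoints(n int) []int {
	var points []int
	s := []int{0} // a literal with one element has len 1 and cap 1
	for i := 0; i < n; i++ {
		before := cap(s)
		s = append(s, 0)
		if cap(s) != before {
			points = append(points, len(s)-1) // length before this append
		}
	}
	return points
}

func main() {
	fmt.Println(reallocationPoints(40))
}
```

Each recorded length corresponds to an iteration where `a` and `b` stopped sharing memory, which is where the program above prints 0.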

codeflo · on Aug 30, 2021

Does that mean that a) the pointer to the single element is only invalidated on “append” if the slice has no more capacity (this is how C++ vectors work), or b) does the fact that there’s an active reference into the slice always cause a reallocation (copy-on-write style)? If it’s the former, we’re literally in the C++ iterator invalidation nightmare, only without the debugging tools.

_ph_ · on Aug 30, 2021

The array is only reallocated when it runs out of space. There is no magic about the "old" array. The GC just won't collect it, as long as a pointer to it exists.

I don't think there are many reasons, if any at all, to keep a pointer to an array element of a slice around in Go. Usually I only take the address of an array element when passing it to some C code or doing some low-level manipulation, but then I don't keep the pointer around. In Go code you usually just keep the slice object, which contains the necessary pointer anyway.

dgb23 · on Aug 30, 2021

Thank you for clarifying this. I was scratching my head reading this discussion, wondering why one would do this in the first place.

dthul · on Aug 30, 2021

Without knowing much Go I believe it's neither a nor b. The pointer to the single element will always stay valid, no matter whether reallocation happens or not (and having a pointer doesn't influence whether reallocation happens or not). Re-allocation might be a confusing word here because afaik it's actually always a new allocation (the old one is not touched) and only if there are no more pointers to the old allocation will the next GC cycle deallocate it. So there is never iterator invalidation like in C++ but of course you still need to be careful because you might accidentally share or not share the same underlying data.

throwaway894345 · on Aug 30, 2021

This is correct. A pointer like `p := &x[0]` will always point at the original backing array even if an append on the slice causes the slice to allocate a new backing array. This means that you can update `x[0]` on the new slice without changing `*p`.

https://play.golang.org/p/Hl58VW-Yvhn
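For readers who don't want to follow the link, here is a rough sketch of the same idea (not the playground code itself). `pointerAfterAppend` is a made-up name, and the `capacity` parameter is only there to force or avoid the reallocation.

```go
package main

import "fmt"

// pointerAfterAppend (hypothetical name) takes a pointer to x[0],
// appends to x, then writes x[0] through the slice.
func pointerAfterAppend(capacity int) (viaPointer, viaSlice int) {
	x := make([]int, 1, capacity)
	p := &x[0] // points into the original backing array

	x = append(x, 0) // reallocates only if capacity was 1
	x[0] = 7

	return *p, x[0]
}

func main() {
	fmt.Println(pointerAfterAppend(1)) // 0 7
	fmt.Println(pointerAfterAppend(2)) // 7 7
}
```

With capacity 1 the append moves `x` to a new array and `*p` keeps the stale value. With capacity 2 nothing moves and `*p` sees the write. The pointer is valid either way; what it aliases depends on the capacity.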

friendzis · on Aug 30, 2021

AFAIK, neither.

Since the slice API is pass-by-value, in theory ANY method will invalidate the pointer. In practice only resizing methods actually NEED to reallocate the underlying array, but magic can happen. However, the garbage collector will make sure that a former underlying array that still has pointers to it remains allocated. This means that (1) pointers to single elements will always dereference, and (2) slice structure modification can leave pointers pointing to stale data.

_ph_ · on Aug 30, 2021

Yes, this is entirely correct.

Zababa · on Aug 30, 2021

> Rust has created a weird perception that memory safety equals safety.

I think it's a bit more than that. They're also riding on the static typing trend that's happening right now, so type-safety is also part of the equation. From the website:

> A language empowering everyone to build reliable and efficient software.

> Reliability: Rust’s rich type system and ownership model guarantee memory-safety and thread-safety — enabling you to eliminate many classes of bugs at compile-time.

That means that people like me that still have a hard time with C and C++ can build efficient software using the same workflow as I'm used to in my usual "web languages" (Python and JS mostly).

throwaway894345 · on Aug 30, 2021

> They're also riding on the static typing trend that's happening right now, so type-safety is also part of the equation

I think the "static typing trend" is a product of Rust and Go showing people that static typing doesn't have to be cumbersome like it was in 90s-00s Java, C++, and C#. Indeed, I suspect that the quality of life improvements that Java, C#, and C++ made also improved the stock price of static typing (and building on that foundation, things like TypeScript are exposing JavaScript developers to the utility of types). Which is to say, static typing isn't an empty trend or fad (no idea if that's your intended meaning) but rather people were previously averse to static typing because the mainstream statically typed languages weren't ergonomic and people assumed that the bad ergonomics was caused by static typing--now we have many mainstream languages that show that this isn't the case.

Zababa · on Aug 30, 2021

That wasn't my intended meaning, I don't think static typing is a "fad". I think the "new" typed languages (Go, Rust, TypeScript) are more ergonomic than 90s-00s Java, C++, C#, as you said. This is also forcing the older languages to improve, with features like type inference, sealed classes, records. I also think that the combination of gradual typing and type inference is playing a big role in the adoption.

However, I called static typing a "trend", and I'll try to explain why. I think attempts to type Python, JS, Ruby and the popularity of Go and Rust are the natural consequences of people departing from the Java/C++ ecosystem 10-20 years earlier (for good reasons). Now they are rediscovering the good parts of this ecosystem (ease of deployment with binaries/fat jars, static typing, performance). Since Twitter, GitHub, YouTube, Shopify, Instagram, etc. have all that code around, they are going to either improve it, or try to migrate from it. For example, Stripe is working on an LLVM-based native-code compiler for Ruby https://sorbet.org/blog/2021/07/30/open-sourcing-sorbet-comp.... Instagram is working on a performance-oriented CPython fork https://github.com/facebookincubator/cinder. Twitter, from what I understand, went back to Java, going through Scala first (which is another example of "better type system"). Khan Academy is migrating services from a Django monolith to Go services https://blog.khanacademy.org/half-a-million-lines-of-go/. WhatsApp even had a project to do a statically typed "Erlang 2".

The "trend" here is that some companies that used "new" dynamic languages in the 00s are now very large companies that have enough money to invest in languages, tooling and things like that.

throwaway894345 · on Aug 30, 2021

Makes sense. Thanks for clarifying!

tsimionescu · on Aug 30, 2021

The problem here is identical to the problem of pointers to array elements in C after a 'realloc', except that Go at least guarantees that you're not going to modify some other object's memory.

Of course, since append neither guarantees nor prevents a copy, the semantics of modifying a value through a pointer to a slice element after an append are unspecified, so it is not a useful construct.

masklinn · on Aug 30, 2021

> except that Go at least guarantees that you're not going to modify some other object's memory.

Or that you’re way off in UB (UAF) land.

heleninboodler · on Aug 30, 2021

> The way you describe go's behavior is "takes snapshot of the underlying data", which usually means "deep copy container"

No, there is no mention of a "snapshot". You get a reference to the current backing array, which may or may not continue being used by the slice (depending on reallocations). You're pointing to the live slice backing array, and the values in it may change if someone else is manipulating the slice, up to the point where the slice backing array must be reallocated, at which point you'll continue pointing to the old backing array and be keeping it from getting GC'd.

db48x · on Aug 30, 2021

And that’s really the problem with it. If you want to ensure that you have exclusive access to the element(s), then you have to explicitly copy them first or you get silent data corruption.

And if you want to ensure that multiple things have access to the elements, then you have to avoid reallocations or you get silent data loss.

No matter what you’re doing, a pointer to an element of an array or slice is usually the wrong thing in Go. The language would be better off without them.

sly010 · on Aug 30, 2021

> it does the wrong thing without warning

It's not without warning, it's a well documented behavior.

Just think of slices as "immutable", "pass-by-value" data structures (with a relatively efficient implementation) and everything falls into place.

Mutating them in any way is actually a special case that you do only for performance reason (i.e. you can pre-allocate and fill if you know the size ahead of time) but - as always - you try to keep those abstracted away and to the minimum.

masklinn · on Aug 30, 2021

> It's not without warning, it's a well documented behavior.

Ah yes, the usual excuse for it being fine that C APIs are completely broken and half of them can not be used correctly.

> Just think of slices as "immutable", "pass-by-value" data structures (with a relatively efficient implementation) and everything falls into place.

Except when they don't because `append` itself amortises allocations, which means if you treat slices as immutable and pass by value you will end up with slices sharing a backing array with leftover capacity and stomping on one another's data.

> Mutating them in any way is actually a special case that you do only for performance reason

Mutating them is literally what the normal Go API usage has you do. If you want to avoid mutating slices you need to write this abortion:

    s1 := append(append([]int(nil), s0...), item)

and if you do that in a loop, you get to feature on https://accidentallyquadratic.tumblr.com
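Spelled out as a helper, the copying append might look like the sketch below. `appendCopy` is a made-up name, not something from the standard library, and it is written for `[]int` only to keep it short.

```go
package main

import "fmt"

// appendCopy is a hypothetical helper: it always copies s into a fresh
// backing array before appending, so the result never aliases s.
func appendCopy(s []int, item int) []int {
	out := make([]int, len(s), len(s)+1)
	copy(out, s)
	return append(out, item) // fits in the capacity reserved above
}

func main() {
	s0 := make([]int, 1, 4) // spare capacity that plain append would reuse
	a := appendCopy(s0, 1)
	b := appendCopy(s0, 2)
	a[0] = 9              // does not leak into s0 or b
	fmt.Println(s0, a, b) // [0] [9 1] [0 2]
}
```

Every call copies the whole input, which is the O(n)-per-append cost that turns quadratic in a loop.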

sly010 · on Aug 30, 2021

Ok, I think I just understood the problem. The following would cause a problem:

    s0 := append(make([]string, 0, 4), "zoo") // spare capacity left over
    sa := append(s0, "foo")
    sb := append(s0, "bar") // overwrites sa[1]

Go is truly unique in this sense, and you cannot actually treat Go slices as immutable structures. Point taken.
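A runnable sketch of that collision. Whether two appends to the same slice collide depends on how much spare capacity it has, so this version pins the capacity with `make` instead of relying on whatever `append` hands out. The name `sharedCapacity` is made up for illustration.

```go
package main

import "fmt"

// sharedCapacity (hypothetical name) appends twice to the same slice
// while it still has spare capacity and returns what each result holds.
func sharedCapacity() (string, string) {
	s0 := make([]string, 1, 4) // one element, room for three more
	s0[0] = "zoo"

	sa := append(s0, "foo") // fits in the spare capacity, no reallocation
	sb := append(s0, "bar") // writes into the same slot as sa[1]

	return sa[1], sb[1]
}

func main() {
	fmt.Println(sharedCapacity()) // bar bar
}
```

If `s0` had been created with capacity 1 instead, both appends would reallocate and `sa[1]` would still be "foo".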

foldr · on Aug 30, 2021

> The way you describe go's behavior is "takes snapshot of the underlying data", which usually means "deep copy container".

There’s no need for Go to copy anything in the circumstance the OP described. It just doesn’t shrink the underlying array.

hsn915 · on Aug 30, 2021

Go tells you very explicitly how it resizes slices by forcing you to write this:

    slice = append(slice, item)

By merely typing this all the time when you add elements, you intuitively understand that appending can potentially re-allocate the slice data in a totally different location, so any pointers you have taken before the resize are not guaranteed to be pointing to items in the new slice; just the old slice.

masklinn · on Aug 30, 2021

> Go tells you very explicitly how it resizes slices by forcing you to write this:

No, what most readers intuit from that is that `append` performs no mutation and that this is fine:

    s2 := append(s1, item)

because it looks very much like, say,

    (def s2 (conj s1 item))

and often it will look like it works, especially at the smaller sizes, or if you never modify (or even use) s1.

Except it's absolutely not fine.

That's why other languages separate slices and vectors and avoid conflating two objects which have different behaviours and uses even if their representation is very similar.

hsn915 · on Aug 31, 2021

This is fine:

    s2 := append(s1, item)

It's only not fine if you somehow assume that s2 and s1 will always refer to the same data.

> other languages separate slices and vectors

Go has a separate type for arrays, which is a value type

    var arr [10]int

masklinn · on Aug 31, 2021

> This is fine:

No, it is not.

> It's only not fine if you somehow assume that s2 and s1 will always refer to the same data.

It's not fine if you assume anything about the interaction of s1 and s2. It's not fine if you assume they do alias, it's not fine if you assume they don't.

There is, fundamentally, no situation in which that construct is anything other than a footgun. `append` should only ever be used with the same slice on the LHS and RHS[0], or a brand new slice object constructed for the occasion on the RHS.

> Go has a separate type for arrays, which is a value type

An array is neither a vector nor a slice.

The issue is that Go's slices serve as both a vector and an actual slice, and the union of these interfaces creates footguns which don't exist in either.

[0] IFF that RHS is either a non-parameter local, or a pointer to a slice

jatone · on Aug 31, 2021

the idea that append performs no mutation is fairly insane. its name implies a mutation.

masklinn · on Aug 31, 2021

No, it does not. Appending is the addition of a suffix, it does not say anything about how it works.

Example: https://hackage.haskell.org/package/bytestring-0.11.1.0/docs...

jatone · on Sept 4, 2021

if you add a suffix, you've changed the value. that is a mutation.

37ef_ced3 · on Aug 30, 2021

In Go, a []int ("slice of int") is just a C struct like this, passed by value:

  struct intSlice {
      int* addr;
      int len;
      int cap;
  };

The memory at addr is not owned by the slice. All the slice operations are simply notation for manipulating the struct. Go's garbage collection makes the whole thing work well.

This can be confusing if you're used to C++'s std::vector (which owns the memory) or Python's slices. Go's slices are a shallow pointer/length system exactly like the one used in C all the time. For example:

  void sort(int* addr, int len);

becomes

  func sort(a []int)

A Go slice is just a formalization of C's pointer/length idiom, with terse notation for manipulation.
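A small sketch of what "passed by value" means for that struct in practice. `fill` and `callerView` are made-up names; the point is that the callee gets its own copy of pointer/len/cap but shares the memory behind the pointer.

```go
package main

import "fmt"

// fill receives a copy of the slice header (pointer, len, cap).
func fill(a []int) {
	a[0] = 1         // writes through the shared pointer: caller sees it
	a = append(a, 2) // updates only this function's copy of the header
}

// callerView (hypothetical name) reports what the caller observes
// after calling fill.
func callerView() (first, length int) {
	s := make([]int, 1, 4)
	fill(s)
	return s[0], len(s)
}

func main() {
	fmt.Println(callerView()) // 1 1
}
```

The element write is visible to the caller, but the length change is not, which is why `append` hands back a new slice value.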

benibela · on Aug 30, 2021

In Pascal, there are no slices

No slices, no problems

If you need to work with a part of a string, you can make two ordinary integer variables for offset and length

jayar95 · on Aug 30, 2021

benibela · on Aug 30, 2021

Actually, Pascal has slices

They are just so obscure, I forgot about them and no one uses them. No users, no problems

They are not part of the normal type system. You cannot declare a variable of a type slice. Nor a field. But when a parameter of a function is an (open) array, you can call the function with a slice of an existing array

That avoids most problems

The backing array exists when the function is called, and the function cannot store the slice, so the slice cannot outlive the array. It is like the function borrows the array. Only problem is if the function gets another reference to the array, through a global variable or something, and resizes it