Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Revert "Foundation: fix missing string terminator" #2676

Merged
merged 1 commit into from
Feb 22, 2020

Conversation

compnerd
Copy link
Member

This reverts commit f3a6f50. It also
fixes the underlying issue of the heap corruption.

There is no guarantee that the underlying storage for an allocation is
done by malloc. It is possible that the size may be small enough to
be pushed down to a stack allocation, which cannot be released by
free. Buffers record how they are allocated and adjust the release
accordingly. The safe way to release the memory is to use deallocate.

Although the initial approach fixed the heap corruption, it did so at
the single site, but this is a problem with a public interface. Use the
deallocate method on the buffer when releasing the memory to ensure
that the correct mechanism is used for releasing the resources.

Thanks to Lily Vulcano for pressing me to investigate this further and
root cause the failure in the public interface!

@compnerd
Copy link
Member Author

@swift-ci please test Linux platform

@compnerd
Copy link
Member Author

CC: @millenomi @spevans

@millenomi
Copy link
Contributor

cc @weissi

@spevans
Copy link
Contributor

spevans commented Feb 17, 2020

Good find. Looking through the source we seem to use free quite a lot, eg:

internal func __NSDataInvokeDeallocatorFree(_ mem: UnsafeMutableRawPointer, _ length: Int) {
    free(mem)
}

Maybe we should have an audit/cleanup of malloc/free usage across all the code

Copy link
Contributor

@weissi weissi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can just Change meaning of freeWhenDone given its documentation but @millenomi ’s call

@@ -1359,7 +1359,7 @@ extension NSString {
// just copy for now since the internal storage will be a copy anyhow
self.init(bytes: bytes, length: len, encoding: encoding)
if freeBuffer { // don't take the hint
free(bytes)
bytes.deallocate()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, this is documented to call free: https://developer.apple.com/documentation/foundation/nsstring/1413830-initwithbytesnocopy

I think we should replicate this comment and the caller should make sure that the memory is allocated using malloc. WDYT?

Copy link
Contributor

@millenomi millenomi Feb 17, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So:

  • On Linux and Darwin, free() is ultimately compatible with the allocation done by CF, ObjC and Swift. This is sort of a non-issue on those platforms.
  • On other platforms, there can be a distinction.

This has two sides: a contract side and an implementation side.

On the contract side: in ObjC, the original API here takes a void * and the language does not mandate how you produce that pointer, only that it be valid. It makes sense here to specify precisely the deallocation strategy.

But. This is Swift. And in Swift, this type is UnsafeMutableRawPointer — a type that does have a language-provided default deallocation facility. And in this language, since there are far fewer guarantees wrt the memory layout of language constructs*, the vast majority of these pointers will be produced by the .allocate(byteCount:alignment:) method of that class or the equivalent bound variations in the non-Raw pointers.

My opinion here is that we can deviate from Darwin Foundation as follows:

  • .deallocate() will eventually call through to free() on Linux and Darwin. We are going to rely on this property, for now, or conditionality the change away from free() on these OSes, so long that it interoperates correctly with buffers allocated with .allocate(…).

  • On Windows and other OSes where it's not guaranteed that .allocate(…) use malloc, we mandate that freeWhenDone: true use .deallocate() instead of free(). This puts an onus on the caller to use .allocate(…) rather than use OS facilities directly. I think this is a reasonable requirement to make: callers that choose a nonstandard allocation strategy on other OSes are already mandated to use freeWhenDone: false with these pointers and handle deallocation themselves; malloc() just happens to not be the standard on those OSes.

The contract side is straightforward-ish — we're just extending the existing contract to go from 'specifically free()' to 'the system-preferred deallocation strategy, which on Linux and Darwin is free()'. The implementation side puts a burden on implementors on those OSes where this is not the case — which means that existing setups don't regress.

I think I am in favor of making this change conditional on the OS, and we should note it in the release notes (in Docs/ReleaseNotes*.md) as a well-known deviation from Darwin behavior.

(* that is, you "can't just" take a pointer to a Swift.Array like you can for an array in C, and have strong guarantees about the memory layout of that object unless you are working hand in hand with the compiler and the standard library implementations.)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@millenomi but what happens to all the code that was previously correctly matching malloc with freeWhenDone: true? That works today and will then break, correct?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@weissi I actually have code that passes raw buffer originated from C module (which uses malloc internally) to String(bytesNoCopy:... with "freeWenDone: true". And it crashes with this patch. I understand motivation though.

@compnerd I think we need to add one more detail to the docs. Official documentation doesn't clarify when freeWhenDone takes effect. It is unclear that on Windows we are allowed to do manual cleanup right after String init. As user, I'd rather think that String(bytesNoCopy:.. will need specified buffer alive until deinit.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lxbndr Thanks for chiming in! I mean it is documented to free so I think it should free. If we think we need a Swift version of this, why not create a new String(bytesNoCopy:...deallocateWhenDone:) that would call deallocate?

I'm pretty strongly against just changing the contract.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lxbndr Are you saying that memory allocated with malloc but then passed as an UnsafeRawMutableBufferPointer then has deallocate() called on it crashes? Is this on windows? Because that sounds like another bug.

@weichsel Data has the following initialiser: init(bytesNoCopy bytes: UnsafeMutableRawPointer, count: Int, deallocator: Data.Deallocator) where the deallocator can be specified including a custom one passed with a closure so that probably offers a good way to do it although it needs deallocate() added into Data.Deallocator https://developer.apple.com/documentation/foundation/data/deallocator

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@spevans Yes, this is on Windows. deallocateWhenDone:false together with manual free() works like a charm. Initial PR description now seems about a bit different case. It's about Swift buffers which can't be passed to free(). I think I need look closer on my case.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@weissi Feels like band-aid, I agree. Approach with custom deallocator like in Data looks way better.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lxbndr We're looking into a more comprehensive solution that will fix the issue for good, but a quick search on GitHub shows that using UMRP.alloc…(…) with freeWhenDone: is ubiquitous and prevalent.

@compnerd
Copy link
Member Author

@spevans yeah, we should definitely audit the calls

@spevans
Copy link
Contributor

spevans commented Feb 17, 2020

If an UnsafeMutableRawPointer records how it was allocated, is it possible to precondition on non-heap allocated ones that are passed in? I didnt realise that .allocate() might not use the heap tbh.

However Im not sure what is gained by using free as opposed to dealloc here if dealloc is always going to work. Although its not strictly what the documentation say I think id would be quite the gotcha unless free always complained about non-heap pointers.

@compnerd
Copy link
Member Author

Theres no way to ensure that a non-heap parameter is passed - that is an internal optimization (where a small enough buffer is alloca'ed.). The use of free is a carry over from the ObjC implementation, and just differs here due to the language as @millenomi mentions.

@compnerd
Copy link
Member Author

@compnerd
Copy link
Member Author

@swift-ci please test Linux platform

@compnerd
Copy link
Member Author

@swift-ci please test Linux platform

1 similar comment
@compnerd
Copy link
Member Author

@swift-ci please test Linux platform

This reverts commit f3a6f50.  It also
fixes the underlying issue of the heap corruption.

There is no guarantee that the underlying storage for an allocation is
done by `malloc`.  It is possible that the size may be small enough to
be pushed down to a stack allocation, which cannot be released by
`free`.  Buffers record how they are allocated and adjust the release
accordingly.  The safe way to release the memory is to use `deallocate`.

Although the initial approach fixed the heap corruption, it did so at
the single site, but this is a problem with a public interface.  Use the
`deallocate` method on the buffer when releasing the memory to ensure
that the correct mechanism is used for releasing the resources.

This is not specific to Windows but to any platform where a stack
allocated buffer or an aligned allocation may not be passed to `free`.

Thanks to Lily Vulcano for pressing me to investigate this further and
root cause the failure in the public interface!
@compnerd
Copy link
Member Author

@swift-ci please test Linux platform

@millenomi
Copy link
Contributor

@swift-ci please test Linux

@millenomi millenomi merged commit b883eca into swiftlang:master Feb 22, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants