Status Update - June 2026
By: Ian McCormack, Molly MacLaren, Joshua Sunshine
We are building BorrowSanitizer: an LLVM-based instrumentation tool for finding violations of Rust’s aliasing model in multilanguage applications. If you are new to the project, we recommend checking out the introduction and our first status update before continuing.
This month, we:
- Added support for wildcard provenance, expanding our test suite to cover 93% of relevant tests from Miri, up from 85% last month.
- Implemented our initial deferred reference counting garbage collector.
- Upgraded our diagnostics to point to relevant user code.
Wildcard Provenance
One of the biggest gaps in BorrowSanitizer’s coverage of Tree Borrows was support for wildcard semantics. We need this feature to accommodate programs that use integer-to-pointer conversions.
For context, when an integer is cast into a pointer, it is somewhat unclear what provenance it should have. By default, Miri runs in “strict provenance” mode, where pointers cast from integers trigger undefined behavior on dereference.
You can also enable permissive provenance, which attempts to validate accesses through these pointers in an “angelic” way. When a pointer is cast into an integer, its provenance is added to a global list of “exposed” values. Then, pointers cast from integers receive “wildcard” provenance values.
If an access using a wildcard pointer falls within the bounds of an existing allocation identified by an exposed provenance value, then Miri will validate the access against that allocation in the most permissive way possible. Miri will only report undefined behavior if none of the existing permissions in the stack or tree could permit the access. You can read more about what this means in terms of Stacked and Tree Borrows here.
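As a rough illustration of the angelic check described above, here is a toy model in Rust. The names `Tag`, `Allocation`, and `wildcard_access_allowed` are hypothetical and exist only for this sketch. They do not correspond to Miri’s or BorrowSanitizer’s actual data structures, and each permission is reduced to a single boolean per tag.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical stand-in for a provenance value (a borrow tag).
type Tag = u64;

/// Toy allocation: an address range plus, for each tag, whether that tag
/// currently permits an access. Real trees track far more state.
struct Allocation {
    base: usize,
    size: usize,
    permits_access: HashMap<Tag, bool>,
}

impl Allocation {
    /// True if the byte range `[addr, addr + len)` lies inside the allocation.
    fn contains(&self, addr: usize, len: usize) -> bool {
        addr >= self.base && addr + len <= self.base + self.size
    }
}

/// Angelic check: a wildcard access is allowed if it is in bounds and at
/// least one exposed tag in the allocation would permit it.
fn wildcard_access_allowed(
    alloc: &Allocation,
    exposed: &HashSet<Tag>,
    addr: usize,
    len: usize,
) -> bool {
    alloc.contains(addr, len)
        && exposed
            .iter()
            .any(|tag| alloc.permits_access.get(tag).copied().unwrap_or(false))
}

fn main() {
    let mut alloc = Allocation {
        base: 0x1000,
        size: 4,
        permits_access: HashMap::from([(1, true), (2, true)]),
    };
    // Casting the pointer with tag 1 to an integer exposes tag 1.
    let exposed: HashSet<Tag> = HashSet::from([1]);

    // Tag 1 is exposed and still usable, so the access is accepted.
    assert!(wildcard_access_allowed(&alloc, &exposed, 0x1000, 4));

    // Once tag 1 is disabled, no exposed tag can justify the access.
    alloc.permits_access.insert(1, false);
    assert!(!wildcard_access_allowed(&alloc, &exposed, 0x1000, 4));
}
```

The key design point is the `any`: undefined behavior is reported only when every exposed candidate fails, which is why exposing one extra tag can turn a rejected program into an accepted one.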
BorrowSanitizer assigns wildcard provenance to the results of inttoptr instructions, and pointers cast to integers via ptrtoint will have their provenance “exposed”. When we first enabled this feature, it caused several false negatives in our test suite. For example, Miri rejects the following program, but BorrowSanitizer accepted it (initially). You can see Miri’s behavior for yourself on the playground.
pub fn main() {
    let mut x: u32 = 42;
    let ptr_base = &mut x as *mut u32;
    let ref1 = unsafe { &mut *ptr_base };
    let ref2 = unsafe { &mut *ptr_base };
    let int1 = ref1 as *mut u32 as usize;
    let wild = int1 as *mut u32;
    //   ┌────────────┐
    //   │            │
    //   │  ptr_base  ├───────────┐
    //   │            │           │
    //   └──────┬─────┘           │
    //          │                 │
    //          │                 │
    //          ▼                 ▼
    //   ┌────────────┐     ┌───────────┐
    //   │            │     │           │
    //   │ ref1(Res)* │     │ ref2(Res) │
    //   │            │     │           │
    //   └────────────┘     └───────────┘
    // Disables ref1.
    *ref2 = 13;
    // Tries to do a wildcard access through
    // the only exposed reference ref1,
    // which is disabled.
    let _fail = unsafe { *wild };
}
It turns out that BorrowSanitizer also exposed the provenance of ptr_base when debug assertions were enabled, which gave the wildcard access a second exposed candidate that was not disabled. Debug assertions include alignment checks, which transmute pointers into integers. At the moment, these transmutes are compiled to ptrtoint:
(Pointer(..), Int(..)) => {
    // FIXME: this exposes the provenance, which shouldn't be necessary.
    bx.ptrtoint(imm, to_backend_ty)
}
The “ultimate” fix would be to use LLVM’s ptrtoaddr instead. It behaves like ptrtoint, but it does not expose provenance. It will be a little while before Rust can support this instruction, because relevant parts of LLVM’s provenance semantics (like the byte type) are still being stabilized. For the moment, we get around this limitation by disabling debug assertions, but we intend to upstream a workaround using getelementptr.
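For comparison, Rust’s standard library already distinguishes an address-only query from a provenance-exposing cast at the source level. The sketch below assumes Rust 1.84 or later, where the strict provenance APIs are stable. `is_aligned_to` is a hypothetical helper written for this example. It illustrates the language-level semantics only and says nothing about how either call is currently lowered to LLVM IR.

```rust
use std::ptr;

/// Hypothetical alignment check that only needs the address.
/// At the language level, `addr()` does not expose the pointer's provenance.
fn is_aligned_to<T>(p: *const T, align: usize) -> bool {
    p.addr() % align == 0
}

fn main() {
    let mut x: u32 = 42;
    let p = &mut x as *mut u32;

    // Address-only query: no provenance is exposed.
    assert!(is_aligned_to(p, std::mem::align_of::<u32>()));

    // Explicitly exposing provenance, equivalent to `p as usize`.
    let int = p.expose_provenance();

    // Reconstructing a pointer from an integer, equivalent to `int as *mut u32`.
    // The result may pick up any previously exposed provenance.
    let wild: *mut u32 = ptr::with_exposed_provenance_mut(int);
    unsafe { *wild = 13 };
    assert_eq!(x, 13);
}
```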
Garbage Collection
We also finished designing and implementing our garbage collector! If you are new to Rust’s aliasing model, this might seem like a bit of an odd design decision. Why would we need garbage collection to validate the semantics of a language with manual memory management?
It turns out that garbage collection is critical to the performance of this category of run-time checking. Every time that a new reference to an allocation is created, both BorrowSanitizer and Miri modify the allocation’s metadata to create a new permission for the allocation. New permissions take up space, making it incrementally more expensive to validate memory accesses.
Permissions are identified by pointer-level provenance metadata. If we can prove that a pointer’s provenance is no longer reachable in memory, then we can eliminate its associated state from allocation-level metadata. In terms of Tree Borrows, this means “pruning” nodes from an allocation’s tree.
Miri determines which provenance values are reachable using a tracing garbage collector. Periodically, Miri halts the execution of every thread, collects a global set of reachable provenance values in stack and heap memory, and then visits every tree to remove any of the nodes that it could not find. We cannot use this approach because we do not have a mechanism for tracking where provenance values are currently stored on the heap.
Instead, we implemented a “deferred reference counting” technique. This is a hybrid approach to garbage collection. It uses reference counting to track copies of values that are stored on the heap, and then periodically “stops the world” to scan for values on the stack. Any values with a reference count of zero that cannot be located on the stack are collected. In our case, the “values” that we are collecting are the nodes within each tree, which are identified by the provenance of pointers stored in memory.
We chose this approach for two reasons. First, it has performance benefits over pure tracing or pure reference counting. We limit the scope of values that need to be scanned, and we avoid the overhead of adjusting reference counts for frequent stack operations. Second, unlike Miri, we do not track where provenance values are stored on the heap, which prevents us from using a tracing approach.
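To make the hybrid scheme concrete, here is a minimal sketch of deferred reference counting over provenance tags. It is not BorrowSanitizer’s implementation: `Collector`, `track`, `heap_store`, `heap_drop`, and `collect` are hypothetical names, and the stack scan is modeled as a set of tags handed to `collect`. In the real tool, each collected tag would correspond to a node pruned from a tree.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical stand-in for a provenance value.
type Tag = u64;

/// Toy collector state: one counter per tag for copies stored on the heap.
/// Copies on the stack are deliberately not counted.
#[derive(Default)]
struct Collector {
    heap_counts: HashMap<Tag, usize>,
}

impl Collector {
    /// Register a newly created tag with no heap copies yet.
    fn track(&mut self, tag: Tag) {
        self.heap_counts.entry(tag).or_insert(0);
    }

    /// Called when a pointer carrying `tag` is stored to heap memory.
    fn heap_store(&mut self, tag: Tag) {
        *self.heap_counts.entry(tag).or_insert(0) += 1;
    }

    /// Called when a heap slot holding `tag` is overwritten or freed.
    fn heap_drop(&mut self, tag: Tag) {
        if let Some(count) = self.heap_counts.get_mut(&tag) {
            *count = count.saturating_sub(1);
        }
    }

    /// Deferred step: with the world stopped, take the tags found on the
    /// stack and collect every tag that has a zero heap count and is not
    /// among them. Returns the collected tags in ascending order.
    fn collect(&mut self, stack_roots: &HashSet<Tag>) -> Vec<Tag> {
        let mut dead: Vec<Tag> = self
            .heap_counts
            .iter()
            .filter(|(tag, count)| **count == 0 && !stack_roots.contains(*tag))
            .map(|(tag, _)| *tag)
            .collect();
        dead.sort_unstable();
        for tag in &dead {
            self.heap_counts.remove(tag);
        }
        dead
    }
}

fn main() {
    let mut gc = Collector::default();
    gc.track(1); // only ever lives on the stack
    gc.track(2); // stored to the heap once
    gc.track(3); // no longer referenced anywhere
    gc.heap_store(2);

    let stack_roots = HashSet::from([1]);
    assert_eq!(gc.collect(&stack_roots), vec![3]);

    // Once the heap copy is gone and tag 2 is not on the stack, it is collected too.
    gc.heap_drop(2);
    assert_eq!(gc.collect(&stack_roots), vec![2]);
}
```

Note that pushing or popping a stack copy never touches `heap_counts`. That is the source of the savings over plain reference counting, at the cost of a periodic stack scan.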
If you would like to learn more about our garbage collector, then take a look at its design document or the PR, which should be merged this week.
Diagnostics Update
Diagnostic reports should point to a location in user code so that bugs can be fixed locally, rather than deep in library code. Until this update, that was not always the case.
BorrowSanitizer previously retrieved only the exact location where a bug was detected, which is often inside unsafe library code. These libraries themselves (especially the standard library) don’t typically misuse unsafe; it’s more likely that unsafe user code misuses a library call.
By including the last call from user code in our reports, we can ensure they are actionable for fixing bugs in relevant local programs.
For example, in mixed_cell_deallocate.rs the aliasing violation diagnostic initially displayed a location inside std::alloc, one of its dependencies:
error: Undefined Behavior: deallocation through <957>(StrongProtector) at alloc179[0x4] is forbidden
--> /root/.rustup/toolchains/bsan/lib/rustlib/src/rust/library/std/src/sys/alloc/unix.rs:48:18
|
48 | unsafe { libc::free(ptr as *mut libc::c_void) }
|
= help: the accessed tag <957>(StrongProtector) has state Frozen which forbids this deallocation (acting as a child write access)
While this is precisely where the violation occurs, a more complex program may free memory unsafely in several other places, making it difficult to identify where this free was called from.
Therefore, we roll back the stack until we’re outside of library code to provide the last relevant call as well as the exact source. This way, we can see both where and why the aliasing violation occurs:
error: Undefined Behavior: deallocation through <957>(StrongProtector) at alloc179[0x4] is forbidden
--> /workspaces/bsan/tests/miri-tests/fail/aliasing/both_borrows/mixed_cell_deallocate.rs:13:14
|
13 | unsafe { alloc::dealloc(x as *const _ as *mut T as *mut u8, layout) };
|
= note: the above line calls library code, where the bug was detected:
--> /root/.rustup/toolchains/bsan/lib/rustlib/src/rust/library/std/src/sys/alloc/unix.rs:48:18
|
48 | unsafe { libc::free(ptr as *mut libc::c_void) }
|
= help: the accessed tag <957>(StrongProtector) has state Frozen which forbids this deallocation (acting as a child write access)
This patch to diagnostics not only matches the source lines provided by Miri’s diagnostic reports, but also retains the extra context of where the bug was caught. These details from library code are still helpful for understanding what went wrong in memory at a fine-grained level.
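The stack rollback can be sketched as a search for the first frame outside library code. This is an illustration under stated assumptions, not the actual patch: `Frame`, `is_library_frame`, and `last_user_call` are hypothetical names, and treating any path under `/lib/rustlib/` as library code is a heuristic chosen for this example.

```rust
/// Hypothetical stack frame: just the source location of the call site.
#[derive(Debug, Clone, PartialEq)]
struct Frame {
    file: String,
    line: u32,
}

/// Assumed heuristic: toolchain sources live under a `rustlib` directory.
fn is_library_frame(frame: &Frame) -> bool {
    frame.file.contains("/lib/rustlib/")
}

/// Frames are ordered innermost first. Returns the first frame outside
/// library code, or `None` if every frame is in a library.
fn last_user_call(frames: &[Frame]) -> Option<&Frame> {
    frames.iter().find(|frame| !is_library_frame(frame))
}

fn main() {
    let frames = vec![
        Frame {
            file: "/root/.rustup/toolchains/bsan/lib/rustlib/src/rust/library/std/src/sys/alloc/unix.rs".into(),
            line: 48,
        },
        Frame {
            file: "/workspaces/bsan/tests/miri-tests/fail/aliasing/both_borrows/mixed_cell_deallocate.rs".into(),
            line: 13,
        },
    ];
    let user = last_user_call(&frames).expect("no user frame found");
    assert_eq!(user.line, 13);
    println!("--> {}:{}", user.file, user.line);
}
```

Keeping the innermost frame as well as the one returned here is what lets the report show both the user call and the exact library location.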
Conclusion
That’s all for this month. In July, we will be expanding our testing to include larger, more resource-intensive programs, so that we can stress-test our garbage collector.