- Feature Name: unsized_locals
- Start Date: 2017-02-11
- RFC PR: rust-lang/rfcs#1909
- Rust Issue: rust-lang/rust#48055
Summary
Allow
Have repeat expressions[T]
slice.
Provide some optimization guarantees
Motivation
There are 2 motivations for this RFC:
- Passing unsized values, such as trait objects, to functions by value is often desired. Currently, this must be done through a Box<T> with an unnecessary allocation.
One particularly common example is passing closures that consume their environment
fn takes_closure(f: FnOnce()) { f(); }
But today you have to use a hack, such as taking a Box<FnBox<()>>.
- Allocating a runtime-sized variable on the stack is important for good performance in some use-cases - see RFC #1808, which this is intended to supersede.
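For comparison, the boxed workaround from the first motivation can be sketched as follows. This assumes a toolchain from Rust 1.35 onward, where a boxed dyn FnOnce is directly callable and FnBox is no longer needed; the name takes_boxed_closure is made up for illustration. The Box::new call is the heap allocation this RFC would like to make unnecessary.

```rust
/// Consumes a type-erased closure. The closure has to arrive in a `Box`
/// because `dyn FnOnce() -> String` is unsized and so cannot be passed
/// by value on stable Rust.
fn takes_boxed_closure(f: Box<dyn FnOnce() -> String>) -> String {
    f()
}

fn main() {
    let owned = String::from("environment");
    // `move` makes the closure consume `owned`; `Box::new` allocates on the heap.
    let result = takes_boxed_closure(Box::new(move || owned + " consumed"));
    println!("{}", result);
}
```

With unsized parameters, the signature could take the closure directly and the caller would not need to box it.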
Detailed design
Unsized Rvalues - language
Remove the rule that requires all locals and rvalues to have a sized type. Instead, require the following:
- The following expressions must always return a Sized type:
  - Function calls, method calls, operator expressions
    - implementing unsized return values for function calls would require the called function to do the alloca in our stack frame.
  - ADT expressions
    - see alternatives
  - cast expressions
    - this seems like an implementation simplicity thing. These can only be trivial casts.
- The RHS of assignment expressions must always have a Sized type.
  - Assigning an unsized type is impossible because we don't know how much memory is available at the destination. This applies to ExprAssign assignments and not to StmtLet let-statements.
This also allows passing unsized values to functions, with the ABI being as if a &move pointer was passed (a (by-move-data, extra) pair). This also means that methods taking self by value are object-safe, though vtable shims are sometimes needed to translate the ABI (as the callee-side intentionally does not pass extra to the fn in the vtable, no vtable shim is needed if the vtable function already takes its argument indirectly).
For example:
struct StringData {
    len: usize,
    data: [u8],
}

fn foo(s1: Box<StringData>, s2: Box<StringData>, cond: bool) {
    // This creates a VLA copy of either `s1.data` or `s2.data` on
    // the stack.
    let s = if cond {
        s1.data
    } else {
        s2.data
    };
    drop(s1);
    drop(s2);
    // `consume` is a placeholder for any function taking `[u8]` by value.
    consume(s);
}

fn example(f: for<'a> FnOnce(&'a X<'a>)) {
    let x = X::new();
    // `f` expects a reference, so pass `&x` rather than `x`.
    f(&x); // aka FnOnce::call_once(f, (&x,));
}
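For contrast with by-value self on trait objects, here is a sketch of the workaround available on stable Rust today: declaring the receiver as self: Box<Self>. The names Task, Greeting, and run_erased are hypothetical and are not part of this RFC.

```rust
/// Hypothetical trait: `self: Box<Self>` keeps the method callable on a
/// trait object, at the cost of requiring the value to live in a `Box`.
trait Task {
    fn run(self: Box<Self>) -> String;
}

struct Greeting {
    name: String,
}

impl Task for Greeting {
    fn run(self: Box<Self>) -> String {
        // Dereferencing the box moves the sized `Greeting` out of it.
        let greeting = *self;
        format!("hello, {}", greeting.name)
    }
}

/// Consumes a type-erased task through its boxed receiver.
fn run_erased(task: Box<dyn Task>) -> String {
    task.run()
}

fn main() {
    let task: Box<dyn Task> = Box::new(Greeting { name: String::from("RFC") });
    println!("{}", run_erased(task));
}
```

Under the rules described in this section, a plain self receiver would be enough and the Box would no longer be forced on the caller.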
VLA expressions
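Allow repeat expressions to capture variables from their surrounding environment. If a repeat expression captures such a variable, it has type [T], with the length being evaluated at run-time. If it captures no variable, the length is evaluated at compile-time, like today. For example: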
extern "C" {
    fn random() -> usize;
}

fn foo(n: usize) {
    let x = [0u8; n]; // x: [u8]
    let x = [0u8; n + (random() % 100)]; // x: [u8]
    let x = [0u8; 42]; // x: [u8; 42], like today
    let x = [0u8; random() % 100]; //~ ERROR constant evaluation error
}
"captures a variable"[T]
because it is simple, easy to understand, and introduces no type-checking complications.
The last error message could have a user-helpful note, for example "extract
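To make the distinction concrete, the following sketch shows what stable Rust offers today for the two cases. The function names fixed_buffer and runtime_buffer are illustrative only; the Vec is the usual heap-allocated substitute for a run-time length, not something this RFC specifies.

```rust
/// Constant length: the size is part of the type, so the buffer can live
/// on the stack.
fn fixed_buffer() -> [u8; 42] {
    [0u8; 42]
}

/// Run-time length: stable Rust rejects `[0u8; n]` for a local `n`, so the
/// usual substitute is a heap-allocated `Vec`.
fn runtime_buffer(n: usize) -> Vec<u8> {
    vec![0u8; n]
}

fn main() {
    let fixed = fixed_buffer();
    let dynamic = runtime_buffer(7);
    // Both can be viewed through the same unsized slice type `[u8]`.
    let fixed_view: &[u8] = &fixed;
    let dynamic_view: &[u8] = &dynamic;
    println!("{} and {} zero bytes", fixed_view.len(), dynamic_view.len());
}
```

A VLA expression would let the run-time case stay on the stack as well, with the same [u8] view.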
Unsized Rvalues - MIR
The way this is implementedUse
or a Repeat
and both can be translated easily.
Unsized locals can never be reassigned within a scope. When first assigning
MIR construction
Guaranteed Temporary Elision
MIR likes to create lots of temporaries
TODO: add description of problem & solution.
How We Teach This
Passing arguments
The "guaranteed
Drawbacks
In unsafe code, it is very easy to create unintended temporaries, such as in:
unsafe fn poke(ptr: *mut [u8]) { /* .. */ }

unsafe fn foo(mut a: [u8]) {
    let ptr: *mut [u8] = &mut a;
    // here, `a` must be copied to a temporary, because
    // `poke(ptr)` might access the original.
    bar(a, poke(ptr));
}
If we make [u8] be Copy, that would be even easier, because even uses of poke(ptr);
after the function call would potentially access the supposedly-valid a.
And even if it is not as easy, it is possible to accidentally create temporaries
Unsized temporaries
Alternatives
The bikeshed
There are several alternative options for the VLA syntax.
- The RFC choice, [t; φ] has type [T; φ] if φ captures no variables and type [T] if φ captures a variable.
  - pro: can be understood using "HIR"/resolution only.
  - pro: requires no additional syntax.
  - con: might be confusing at first glance.
  - con: [t; foo()] requires the length to be extracted to a local.
- The "permissive" choice: [t; φ] has type [T; φ] if φ is a constexpr, otherwise [T]
  - pro: allows the most code
  - pro: requires no additional syntax.
  - con: depends on what is exactly a const expression. This is a big issue because that is both non-local and might change between rustc versions.
- Use the expected type - [t; φ] has type [T] if it is evaluated in a context that expects that type (for example [t; foo()]: [T]) and [T; _] otherwise.
  - pro: in most cases, very human-visible.
  - pro: requires no additional syntax.
  - con: relies on the notion of "expected type". While I think we do have to rely on that in the unsafe code semantics of &foo borrow expressions (as in, whether a borrow is treated as a "safe" or "unsafe" borrow - I'll write more details sometime), it might be better to not rely on expected types too much.
- use an explicit syntax, for example [t; virtual φ].
  - bikeshed: exact syntax.
  - pro: very explicit and visible.
  - con: more syntax.
- use an intrinsic, std::intrinsics::repeat(t, n) or something.
  - pro: theoretically minimizes changes to the language.
  - con: requires returning unsized values from intrinsics.
  - con: unergonomic to use.
Unsized ADT Expressions
Allowing unsized ADT expressions
let len_ = s.len();
let p = Box::new(PascalString {
    length: len_,
    data: *s
});
However, without some way to guarantee
Copy Slices
One somewhat-orthogonal proposal that came up was to make Clone (and therefore Copy) not depend on Sized, and to make [u8] be Copy, by moving the Self: Sized bound from the trait itself to its methods:
pub trait Clone {
    fn clone(&self) -> Self where Self: Sized;
    fn clone_from(&mut self, source: &Self) where Self: Sized {
        // ...
    }
}
That would be a backwards-compatibility-breaking change, because today T: Clone + ?Sized (or of course Self: Clone in a trait context, with no implied Self: Sized) implies that T: Sized
, but it might be that its impact is small enough to allow
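The effect of moving the bound can be sketched on stable Rust with a stand-in trait. Dup, Point, and describe_any are hypothetical names, and the sketch uses a trait object rather than [u8] as the unsized type; it only illustrates that a bound of the form T: Dup + ?Sized stops implying T: Sized once the Sized requirement lives on the method.

```rust
/// Hypothetical stand-in for the proposed `Clone`: the trait itself has no
/// `Sized` supertrait; only the by-value method requires `Self: Sized`.
trait Dup {
    fn describe(&self) -> String;

    fn dup(&self) -> Self
    where
        Self: Sized;
}

#[derive(Debug, PartialEq)]
struct Point(i32, i32);

impl Dup for Point {
    fn describe(&self) -> String {
        format!("Point({}, {})", self.0, self.1)
    }

    fn dup(&self) -> Self {
        Point(self.0, self.1)
    }
}

/// `T: Dup + ?Sized` does not imply `T: Sized`, so `T` may be `dyn Dup`.
fn describe_any<T: Dup + ?Sized>(value: &T) -> String {
    value.describe()
}

fn main() {
    let point = Point(1, 2);
    let erased: &dyn Dup = &point;
    println!("{}", describe_any(erased));
    println!("{:?}", point.dup());
}
```

Code written against today's Clone may rely on the implied Sized bound, which is where the compatibility break would come from.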
Unresolved questions
How can we mitigate the risk of unintended unsized or large allocas? Note that the problem already exists today with large structs/arrays. A MIR lint against large/variable stack sizes would probably help users avoid
How do we handle truly-unsized DSTs when we get them? They can theoretically be passed to functions, but they can never be put in temporaries.
Accumulative allocas (aka 'fn borrows) are beyond the scope of this RFC.
See alternatives.