To understand references in Rust, it helps to know how Rust’s ownership system works.
Rust has an ownership system in which every piece of data in memory has exactly one variable that owns it. That variable is called the owner of the data. The data can be stored either on the stack or on the heap.
Values with a fixed size known at compile time, like integers, floats, or booleans, are stored on the stack, while values that can grow or change in size keep their contents on the heap.
Here, x is a stack variable. When we assign y the value of x, the value of x is copied on the stack and y becomes the owner of the copied value. When we print them, we can see that their values are the same.
let x = 2;
let y = x;
println!("x: {x}");
println!("y: {y}");
x: 2
y: 2
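To see that y really is an independent copy, here is a minimal sketch. The helper name copy_then_change is made up for this illustration and is not part of the original example.

```rust
// Hypothetical helper: returns (x, y) after changing only y.
fn copy_then_change() -> (i32, i32) {
    let x = 2;
    let mut y = x; // the integer is copied; x stays valid
    y += 1;        // changing y does not touch x
    (x, y)
}

fn main() {
    let (x, y) = copy_then_change();
    println!("x: {x}");
    println!("y: {y}");
}
```

Changing y leaves x untouched, because each variable owns its own value on the stack.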
Copying data on the stack is cheap, but copying data stored on the heap is expensive. So when we assign b the value of a, the data is not copied. Since there can’t be two owners of the same data, b becomes the new owner of the data, and the variable a becomes invalid.
When we try to print the value of a, we get an error.
let a = "hello".to_string();
let b = a;
println!("{a}");
error[E0382]: borrow of moved value: `a`
--> references/src/main.rs:8:13
|
4 | let a = "hello".to_string();
| - move occurs because `a` has type `String`, which does not implement the `Copy` trait
5 |
6 | let b = a;
| - value moved here
7 |
8 | println!("{a}");
| ^^^ value borrowed here after move
|
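If we really do want two independent values, one option is to ask for the expensive copy explicitly with clone. The sketch below assumes that is acceptable for the program at hand; the helper name clone_instead_of_move is hypothetical.

```rust
// Hypothetical helper: clones a String so both bindings stay usable.
fn clone_instead_of_move() -> (String, String) {
    let a = "hello".to_string();
    let b = a.clone(); // allocates a second, independent copy on the heap
    (a, b)             // a was never moved into b, so it is still valid here
}

fn main() {
    let (a, b) = clone_instead_of_move();
    println!("a: {a}");
    println!("b: {b}");
}
```

Each variable now owns its own heap data, at the cost of an extra allocation.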
When data is stored on the heap, a pointer to that location is returned and pushed to the stack. The variable labels the address of the pointer on the stack.
You can read this article to learn more about ownership, the stack, and the heap. https://cudi.dev/articles/ownership_in_rust_explained
Sometimes you may want a variable to point to the data of another variable without taking ownership of that data. This is called borrowing.
During borrowing, a special pointer to the owner is created and pushed onto the stack. That pointer leads to the owner, which in turn leads to the data on the heap.
This special pointer is called a reference.
To make b a reference to a, we add an ampersand right in front of a.
let a = "hello".to_string();
let b = &a;
Now we can make use of both variables, and print the values of both variables without any error.
println!("a: {a}");
println!("b: {b}");
a: hello
b: hello
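As a rough check that a reference points at the owner rather than at a copy, we can compare addresses. This sketch uses std::ptr::eq from the standard library; the helper name points_to_same is hypothetical.

```rust
// Hypothetical helper: true if `r` points at exactly the same String as `owner`.
fn points_to_same(owner: &String, r: &String) -> bool {
    std::ptr::eq(owner, r)
}

fn main() {
    let a = "hello".to_string();
    let b = &a; // b borrows a; no data is copied or moved
    println!("same String: {}", points_to_same(&a, b));
    println!("length through b: {}", b.len());
}
```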
Creating and using references comes with some rules called the borrowing rules.
The first rule is that a reference must always be valid. When we move ownership of the data from a to the variable c, the variable a and all its references become invalid and we can no longer use them.
let a = "hello".to_string();
let b = &a;
let c = a;
When we try to use the variable b, we get an error.
println!("{b}");
error[E0505]: cannot move out of `a` because it is borrowed
--> references/src/main.rs:8:11
|
4 | let a = "hello".to_string();
| - binding `a` declared here
5 |
6 | let b = &a;
| -- borrow of `a` occurs here
7 |
8 | let c = a;
| ^ move out of `a` occurs here
9 |
10 | println!("{b}");
| --- borrow later used here
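The error only appears because b is used after the move. As a sketch of the opposite case, the hypothetical helper below finishes using the reference first, and then the move is accepted.

```rust
// Hypothetical helper: borrows `a`, finishes with the borrow, then moves `a`.
fn borrow_then_move() -> (usize, String) {
    let a = "hello".to_string();
    let b = &a;
    let len = b.len(); // last use of b; the borrow ends here
    let c = a;         // moving a is fine now, nothing borrows it anymore
    (len, c)
}

fn main() {
    let (len, c) = borrow_then_move();
    println!("{c} has {len} bytes");
}
```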
We can also pass ownership or references through function parameters. To specify that a parameter is a reference, we add an ampersand in front of its data type.
fn main() {
    let a = "hello".to_string();
}
fn takes_reference(s: &String) {
    //
}
fn takes_ownership(s: String) {
    //
}
If we pass a into the function that takes ownership, ownership of the data moves to the function’s parameter, and a becomes invalid. We can no longer use a or any reference to it.
let a = "hello".to_string();
let b = &a;
takes_ownership(a);
// both statements are invalid
println!("{b}");
println!("{a}");
But if we replace it with the function that takes a reference, we pass a reference to a as the argument. The variable a remains valid, and we can still make use of it after the function call.
let a = "hello".to_string();
let b = &a;
takes_reference(&a);
// both statements are now valid
println!("{b}");
println!("{a}");
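To make the difference observable, here is a sketch with variants of the two functions that return the length of the string. The return values are an assumption added for illustration; the original functions have empty bodies.

```rust
// Hypothetical variants of the two functions that return something observable.
fn takes_reference(s: &String) -> usize {
    s.len() // only reads through the reference
}

fn takes_ownership(s: String) -> usize {
    s.len() // s is dropped when this function returns
}

fn main() {
    let a = "hello".to_string();
    let n = takes_reference(&a); // a is only borrowed
    println!("{a} has {n} bytes"); // a is still valid here
    let m = takes_ownership(a); // a is moved; it can't be used after this line
    println!("moved string had {m} bytes");
}
```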
There are two kinds of references: immutable references, which we have used so far, and mutable references.
A mutable reference is used when you want to mutate the variable’s data. Before creating a mutable reference, the variable needs to be marked as mutable by adding mut before the variable name.
To make a reference mutable, add mut after the ampersand.
let mut v = vec![0, 1, 2];
// mutable reference of v
let u = &mut v;
Here we have a function that mutates a vector of integers by pushing 3 onto it. In its parameter a, we specify that a should be a mutable reference to a vector.
fn mutates_vector(a: &mut Vec<i32>) {
    a.push(3)
}
And after we mutate v through the reference, v still has ownership of the vector and we can print out its value.
fn main() {
    let mut v = vec![0, 1, 2];
    mutates_vector(&mut v);
    println!("vector v: {v:?}");
}
fn mutates_vector(a: &mut Vec<i32>) {
    a.push(3)
}
vector v: [0, 1, 2, 3]
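The same thing works without a function, using a local mutable reference like u from earlier. This is a small sketch; the helper name mutate_through_reference is hypothetical.

```rust
// Hypothetical helper: mutates a vector through a local mutable reference.
fn mutate_through_reference() -> Vec<i32> {
    let mut v = vec![0, 1, 2];
    let u = &mut v; // mutable reference of v
    u.push(3);      // append through the reference
    u[0] = 10;      // overwrite an element through the reference
    v               // u is no longer used, so the owner can be used again
}

fn main() {
    println!("vector v: {:?}", mutate_through_reference());
}
```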
The second rule of borrowing is that, at any given time, we can have either any number of immutable references or exactly one mutable reference. We can’t have a mutable reference together with any other reference.
When a mutable reference is created, all earlier references become invalid and we can’t use them.
If we try to print the value of c after d is created, we get an error. But we can use the mutable reference d, since it is the last reference created.
let mut a = vec![0, 1, 2];
let b = &a;
let c = &a;
let d = &mut a;
d.push(3);
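Immutable references are only a problem if they are used after the mutable one is created. In this sketch, b and c are both used before d exists, so the code is accepted. The helper name shared_then_mutable is hypothetical.

```rust
// Hypothetical helper: uses two shared references, then one mutable reference.
fn shared_then_mutable() -> (usize, Vec<i32>) {
    let mut a = vec![0, 1, 2];
    let b = &a;
    let c = &a;
    let total = b.len() + c.len(); // last use of b and c
    let d = &mut a;                // allowed: no shared borrow is still in use
    d.push(3);
    (total, a)
}

fn main() {
    let (total, a) = shared_then_mutable();
    println!("total: {total}, a: {a:?}");
}
```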
We can create more than one mutable reference to the same data, as long as an earlier one is not used after a later one is created. Here the mutable references d and e both mutate the vector a. As long as no reference created before e is used after e is created, our program compiles and runs successfully.
let mut a = vec![0, 1, 2];
let d = &mut a;
d.push(3);
let e = &mut a;
e.push(4);
References can’t outlive the owner. Here we have a variable s, declared in the outer scope, and in the inner scope we have a variable t, which is the integer 5. We then make s a reference to t.
let s;
{
    let t = 5;
    s = &t
}
This compiles as it stands, but the problem is that when t goes out of scope, all its references become invalid, so s becomes invalid.
If we try to print the value of s, we get an error. Rust figures this out using lifetimes.
println!("{s}");
error[E0597]: `t` does not live long enough
--> references/src/main.rs:11:9
|
9 | let t = 5;
| - binding `t` declared here
10 |
11 | s = &t
| ^^ borrowed value does not live long enough
12 | }
| - `t` dropped here while still borrowed
13 |
14 | println!("{s}");
| --- borrow later used here
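One way to satisfy the compiler, sketched here under the assumption that t can simply live in the outer scope, is to declare the owner before the inner block. The helper name reference_within_owner_scope is hypothetical.

```rust
// Hypothetical helper: the owner t now lives at least as long as the reference s.
fn reference_within_owner_scope() -> i32 {
    let t = 5; // owner declared in the outer scope
    let s;
    {
        s = &t; // t outlives this inner block
    }
    *s // still valid: t has not been dropped
}

fn main() {
    println!("{}", reference_within_owner_scope());
}
```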
Each variable has a lifetime associated with the scope in which it was created. Here, s has a lifetime which we will call 'a, and t has a lifetime which we will call 'b.
When we assign s a reference to t, Rust compares the lifetimes of s and t and sees that the lifetime of s is longer than that of t. When t goes out of scope, s would become invalid, so the compiler rejects the program.
fn main() {
    // LIFETIMES
    //---------------------------- 'a
    let s;                     //|
                               //|
    {                          //|
        //---------------'b      |
        let t = 5;     //|       |
                       //|       |
        s = &t         //|       |
        //---------------'b      |
    }                          //|
    //---------------------------|'a
}
Most of the time, we do not need to specify the lifetime of a reference, since the compiler can work it out from the scope.
But to use a reference in a struct, and sometimes in a function, we need to specify the lifetime of each reference.
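For the struct case, a minimal sketch could look like this. The struct Greeting, its field text, and the method shout are all made up for illustration.

```rust
// Hypothetical struct that stores a borrowed string slice.
// The lifetime 'a says a Greeting cannot outlive the text it borrows.
struct Greeting<'a> {
    text: &'a str,
}

impl<'a> Greeting<'a> {
    // Returns an owned, upper-cased copy of the borrowed text.
    fn shout(&self) -> String {
        self.text.to_uppercase()
    }
}

fn main() {
    let owner = "hello".to_string();
    let g = Greeting { text: &owner }; // g borrows from owner
    println!("{}", g.shout());
}
```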
To understand specifying lifetimes in a function, let’s look at these three functions.
In the first function, we try to return a reference to a variable created inside the function. But when the function’s scope ends, the variable x is dropped and the reference would become invalid. The return type has nothing to borrow from, so the function doesn’t compile.
fn example_1() -> &i32 {
    let x = 2;
    &x
}
error[E0106]: missing lifetime specifier
--> references/src/main.rs:9:19
|
9 | fn example_1() -> &i32 {
| ^ expected named lifetime parameter
|
= help: this function's return type contains a borrowed value, but there is no value for it to be borrowed from
help: consider using the `'static` lifetime, but this is uncommon unless you're returning a borrowed value from a `const` or a `static`
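Two possible ways out, sketched here as assumptions about what the function is meant to do: return the value itself instead of a reference, or borrow from a static as the compiler’s hint suggests. The names example_1_owned, example_1_static, and TWO are hypothetical.

```rust
// A static lives for the whole program, so references to it are 'static.
static TWO: i32 = 2;

// Option 1: return the integer by value; it is copied out to the caller.
fn example_1_owned() -> i32 {
    let x = 2;
    x
}

// Option 2: return a reference to data that is never dropped.
fn example_1_static() -> &'static i32 {
    &TWO
}

fn main() {
    println!("{}", example_1_owned());
    println!("{}", example_1_static());
}
```

Returning the value is usually the simpler choice for small types like integers.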
In the second function, we return x, the function’s parameter, which is a reference to an integer. It compiles successfully because the compiler gives the return value the same lifetime as the parameter x.
fn example_2(x: &i32) -> &i32 {
    x
}
So when we make use of this function here, the variable b is valid for as long as a is, because it is given the lifetime of a.
fn main() {
    let a = 5;
    let b = example_2(&a);
}
Even if we create an inner scope and assign b the return value of this function there, we can still make use of b outside that scope, because b borrows from a, which is still alive.
fn main() {
    let a = 5;
    let b;
    {
        b = example_2(&a);
    }
    println!("b = {b}");
}
b = 5
The third function has two reference parameters. Now the compiler cannot figure out which lifetime to assign to the return value, so it reports an error and we have to add a lifetime to the function manually.
fn example_3(x: &i32, y: &i32) -> &i32 {
    x
}
error[E0106]: missing lifetime specifier
--> references/src/main.rs:19:35
|
19 | fn example_3(x: &i32, y: &i32) -> &i32 {
| ---- ---- ^ expected named lifetime parameter
|
= help: this function's return type contains a borrowed value, but the signature does not say whether it is borrowed from `x` or `y`
help: consider introducing a named lifetime parameter
|
19 | fn example_3<'a>(x: &'a i32, y: &'a i32) -> &'a i32 {
| ++++ ++ ++ ++
We label the lifetime of a reference by adding an apostrophe after the ampersand, followed by the name we choose for the label, in lowercase.
&'a i32
To add a lifetime to a function, we declare the lifetime labels used by the function in angle brackets after its name, then give each reference parameter and the return value its lifetime after the ampersand.
fn example_3<'a>(x: &'a i32, y: &'a i32) -> &'a i32 {
    x
}
This tells the compiler that the returned reference is valid only while both parameters are valid. If either of the two is dropped, the return value of the function can no longer be used.
To learn more about lifetimes, check out this article [?out soon].
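To see why a shared lifetime is useful, consider a function that may return either parameter. The sketch below uses a hypothetical function larger, which is not part of the examples above.

```rust
// Hypothetical function: returns a reference to the larger of two integers.
// Either parameter may be returned, so the result shares the lifetime 'a.
fn larger<'a>(x: &'a i32, y: &'a i32) -> &'a i32 {
    if x >= y { x } else { y }
}

fn main() {
    let a = 5;
    let result;
    {
        let b = 7;
        // The returned reference is only usable while both a and b are alive,
        // so we copy the integer out of it before b goes out of scope.
        result = *larger(&a, &b);
    }
    println!("larger = {result}");
}
```

Keeping the reference itself past the inner block would be rejected, because b would be dropped while still borrowed.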
Thanks for reading.