Implementing a Linked List in Rust

An introduction to data structures in Rust with a simple linked list example

Published on
May 30, 2024

Read time
9 min read

Introduction

I’m a self taught programmer and I have been learning Rust for the past few months. I don’t have a particularly good reason to use Rust. I mainly work in web development and the problems I’m typically solving do not require the kind of performance enhancements that involve switching to a completely different language!

Instead, I’m using it as a way to deepen my understanding of how programming actually works and what higher-level languages like JavaScript and Python are doing for us under the hood.

With that goal in mind, why not also try to firm up my understanding of data structures?

Why a linked list?

A linked list is one of the fundamental data structures in computer science.

But if you’re a web developer, you’re unlikely to need a specific list implementation in your front-end application code. It’s faster to just use an array, because that’s built in to the language and optimised in whatever low-level language the underlying JavaScript engine uses (usually C++).

In Rust, however, choosing the right data structure really does make a difference. Because we have access to low-level tools like memory management, the right data structure can give us an edge in terms of performance or memory.

If you’re learning Rust or data structures (or both), you’ve come to the right place!

Starting on familiar territory

If you know a higher-level langauge, you’ll be familiar with arrays. These are a type of list where each item is given an index, usually starting from zero, like below.

Using indices gives us a powerful advantage, which is that accessing an element takes a fixed amount of time, regardless of the size of the array: in Big O Notation, we call this O(1).

This efficiency is due to the way arrays use memory, known as the "memory layout". Each item in an array is stored in a "contiguous" block of memory: we know the size of each item and the starting address of the array, so the position of any index can be calculated directly.

So, in arrays it’s very quick to access an element. But other operations are slower.

Some limitations of arrays

What about inserting new elements? If we push an element to the end of the array, that also occurs in constant time.

But if we need to insert a new item at the beginning of the array, the index of every other item needs to be incremented. 0 becomes 1, 1 becomes 2, and so on. This means the operation is linear.

Therefore, insertion has a best-case scenario of O(1) but a worst-case scenario of O(N). On average, because the time taken to insert an item scales with the length of the array, we call it O(N).

If we need to insert items quickly at the start, we’ll need a different data structure...

Introducing the linked list (with a tail pointer)

In a linked list with a tail pointer, we can insert an element in constant time, either at the beginning or - thanks to the tail pointer - at the end. But accessing an element is linear.

Each element or node in a linked list contains:

  • a value
  • a reference to the next element in the sequence.

For example:

Above, the first node contains the value 13 and a reference (represented by an arrow) to the next node. The second node contains the value 45 and a reference to the next node, and so on.

The last element in the sequence has a reference to nothing, which is usually represented as None in Rust.

It is also helpful to keep a record of the first and last elements in the sequence. These are called the head and tail, respectively:

(If we keep a record of the tail, we can also insert items at the end of the list in constant time.)

Efficient insertion could be useful in many contexts, such as implementing a queue or a stack, where elements are frequently added and removed from the beginning or end of the list.

Notable Rust features for our implementation

Before diving into the implementation, let’s review some Rust features that will help us create a performant linked list.

Here, I’ll focus on concepts related to memory management, which may be less familiar to people coming from higher-level, garbage-collected languages.

Box

In Rust, the Box type is used to allocate memory on the heap rather than the stack. This is important for data structures whose size is not known at compile time, and since linked lists are dynamic in size, we need to use Box to store our nodes.

Unsafe

Rust guarantees memory safety at compile time. However, this can be restrictive when we’re trying to write code that is as efficient as possible.

For example, instead of cloning a value, we might want to manually move pointers to avoid creating unnecessary copies. This is more efficient, but Rust can no longer guarantee that the code is safe: we use the unsafe keyword to bypass these restrictions when necessary.

This allows us to (carefully!) use powerful low-level functions, such as Box::from_raw.

It’s worth noting that we certainly don’t need unsafe: it’s a trade-off and there’s an example at the end of this article of an alternative approach.

Dereferencing

Borrowing is a fundamental concept in Rust. It allows us to pass references to values rather than the values themselves, which is more efficient—and sometimes simpler to manage.

When we start learning Rust, we learn to use the & and &mut operators to borrow values.

However, when working with optional value, the as_deref and as_deref_mut functions allow us to borrow the value inside of the Option type.

More specifically, they change Option<&T> to Option<&U>, where U is a reference to T.

Implementing a linked list

Now that we’ve reviewed the basic concepts and Rust features, let’s implement a linked list.

Defining the node

We’ll start by defining a LinkedListNode struct that contains a value and a reference to the next node.

#[derive(Debug)]
struct LinkedListNode<T> {
    value: T,
    next: Option<Box<LinkedListNode<T>>>,
}

Here, Box is used in this code because we don't know the size of next at compile time.

We’ll use a generic type T to allow the node to store any type of value. I’ll also add the Debug trait to every struct we create, which will allow us to print the struct using the dbg! macro.

Next, we’ll implement a new function that creates a new node with a given value.

impl<T> LinkedListNode<T> {
    fn new(value: T, next: Option<Box<LinkedListNode<T>>>) -> LinkedListNode<T> {
        LinkedListNode { value, next }
    }
}

Defining the list

We’ll now define a LinkedList struct that contains the head and tail of the list.

#[derive(Debug)]
struct LinkedList<T> {
    head: Option<Box<LinkedListNode<T>>>,
    tail: Option<*mut LinkedListNode<T>>,
}

For the tail field, we can use a raw pointer *mut LinkedListNode<T> to avoid borrowing issues when we need to modify the tail.

As with the LinkedListNode, we’ll implement a new function that creates a new list with no elements.

impl<T: PartialEq> LinkedList<T> {
    fn new() -> LinkedList<T> {
        LinkedList {
            head: None,
            tail: None,
        }
    }
}

It’s time to start implementing the methods that will allow us to interact with the list!

Since linked lists are great for inserting elements at the beginning, we’ll start by implementing a prepend method to do just that. All the methods we implement should be part of the impl block for the LinkedList struct.

Prepend

First, we’ll create a new node with the given value and a reference to the current head. We can use the take method to move the existing head value and leave None in its place.

fn prepend(&mut self, value: T) {
    let new_node = Box::new(LinkedListNode::new(value, self.head.take()));

    // move the new_node into the head
}

Next, we’ll move the new node into the head.

fn prepend(&mut self, value: T) {
    let new_node = Box::new(LinkedListNode::new(value, self.head.take()));

    self.head = Some(new_node);
}

So far, pretty simple!

However, we also need to update the tail if it’s None. To save us from unnecessarily cloning new_node, we can instead store the tail as a raw pointer. This is where the unsafe keyword comes in.

In the unsafe block, we’ll convert the new_node into a raw pointer using Box::into_raw. We can then set the tail to this pointer, and finally, we’ll use Box::from_raw to convert the raw pointer back into a Box and set this as the new head.

impl<T: PartialEq> LinkedList<T> {
  // other methods

  fn prepend(&mut self, value: T) {
      let new_node = Box::new(LinkedListNode::new(value, self.head.take()));

      unsafe {
          let raw_node: *mut _ = Box::into_raw(new_node);

          if self.tail.is_none() {
              self.tail = Some(raw_node);
          }

          self.head = Some(Box::from_raw(raw_node));
      }
  }
}

We can see that prepending an element requires a constant number of operations, unaffected by the size of the list!

What about appending an element to the end of the list?

Append

We can also append an element in constant time by updating the tail pointer. This is similar to the prepend method, but we need to update the next field of the current tail node.

We also want to account for the case where the list is empty, and the head is None.

fn append(&mut self, value: T) {
    let new_node = Box::new(LinkedListNode::new(value, None));

    unsafe {
        let raw_node: *mut _ = Box::into_raw(new_node);

        if self.head.is_none() {
            self.head = Some(Box::from_raw(raw_node));
        } else if let Some(tail) = self.tail {
            (*tail).next = Some(Box::from_raw(raw_node));
        }

        self.tail = Some(raw_node);
    }
}

The second if statement is a bit more complex. If self.tail has a value, we need to dereference the tail pointer so that we can make next point to the new node.

Finally, we update the tail pointer to the raw pointer value of the new node.

Insert

Next, let’s look at a linear operation. If we want to insert an element at a specific index, we need to traverse the list until we reach the desired index.

This is because, unlike arrays, we can’t directly access the nth element of a linked list. We need to start at the head and follow the next references until we reach the desired index.

Here’s one approach to implementing the insert method:

fn insert(&mut self, value: T, index: usize) {
    if index == 0 {
        self.prepend(value);
        return;
    }

    let mut count = 1;
    let mut current_node = self.head.as_deref_mut();

    while let Some(node) = current_node {
        if count == index {
            let new_node = Box::new(LinkedListNode::new(value, node.next.take()));
            node.next = Some(new_node);
            return;
        }
        current_node = node.next.as_deref_mut();
        count += 1;
    }

    self.append(value);
}

In this method, we first check if the index is zero. If it is, we can prepend the element. Otherwise, we traverse the list until we reach the desired index.

In the loop, we use Box::new to create a new node in the heap with the given value and a reference to the next node. The take method is used to move the next value from the current node and leave None in its place. We can then update the next field of the current node to point to the new node.

Finally, if we reach the end of the list and don’t find the desired index, we can append the element.

Find

Finding an element in a linked list is also a linear operation: we need to traverse the list until we find the element we’re looking for.

fn find(&self, value: T) -> Option<&LinkedListNode<T>> {
    let mut current = self.head.as_deref();

    while let Some(node) = current {
        if node.value == value {
            return Some(node);
        }
        current = node.next.as_deref();
    }
    None
}

Delete

Lastly, let’s implement a method to delete an element from the list. This is also a linear operation, but it’s a bit more complex than the previous methods.

fn delete(&mut self, value: T) {
    // Handle when the list is empty
    if self.head.is_none() {
        return;
    }

    // Handle when the head node needs to be deleted
    if let Some(ref mut head) = self.head {
        if head.value == value {
            self.head = head.next.take();
            if self.head.is_none() {
                self.tail = None;
            }
            return;
        }
    }

    let mut current = self.head.as_deref_mut();

    // Handle when a non-head node needs to be deleted
    while let Some(ref mut node) = current {
        if let Some(ref mut next) = node.next {
            if next.value == value {
                node.next = next.next.take();
                if node.next.is_none() {
                    self.tail = Some(node);
                }
                return;
            }
        }
        current = node.next.as_deref_mut();
    }
}

Deleting the head node is straightforward: we can just move the next value into the head. If the list is empty after deleting the head, we also need to update the tail.

For other nodes, we need to traverse the list until we find the node we want to delete. We can then update the next field of the previous node to point to the node after the one we’re deleting.

Why do we need unsafe?

The use of unsafe is a trade-off where we exchange performance for having to do some memory allocation ourselves.

I wanted to demonstrate an approach that makes this trade-off, as I believe a well-implemented data structure should be as performant as possible, but we could certainly avoid unsafe by relying on standard library methods instead!

Here’s one way to doing this. We could update the type of tail so it is a full node:

#[derive(Debug, Clone)]
struct LinkedList<T> {
    head: Option<Box<LinkedListNode<T>>>,
    tail: Option<Box<LinkedListNode<T>>>,
}

We also need to make sure to implement the Clone trait on the LinkedListNode and LinkedList structs.

Then, taking our prependmethod as an example, we could update it to look something like this:

fn prepend(&mut self, value: T) {
    let new_node = Box::new(LinkedListNode::new(value, self.head.take()));

    if self.tail.is_none() {
        self.tail = Some(new_node.clone());
    }

    self.head = Some(new_node);
}

There is potential overhead associated with cloning, which can be significant for large or complex types. But with an approach like this, we no longer have to worry about the pitfalls of unsafe Rust!


And that’s it! I hope this article has been helpful in understanding the basics of linked lists and how they can be implemented in Rust.

This is, of course, a fairly simple implementation. There are many ways to improve it and many more methods we could add to make it more useful. If you’re interested in exploring further, I recommend looking into doubly linked lists, which allow you to traverse the list in both directions. Each node in a doubly linked list contains a reference to the previous node as well as the next node. Happy coding!

© 2024 Bret Cameron