Compile-safe builder pattern using phantom types

Posted on Dec 1, 2021 (updated on Feb 1, 2024)

A road builder in the mountains showing a long curvy road towards a lake.

Introduction #

In this short article, we build a classic builder pattern to demonstrate the power of phantom types. A phantom type is a type that exists only at compile time: it gives the compiler extra information so that it can enforce additional constraints. If any of those constraints does not hold, the program fails to compile, so you catch the mistake before it turns into a costly runtime issue. As a bonus, phantom types add no runtime overhead: they are needed only for compilation and are erased from the generated bytecode.

Builder pattern in Java #

First, let’s have a look at the builder pattern in Java. This is the type we want to build:

java
public record Person(String firstName, String lastName, String email) {}

The builder itself could look something like this:

java
package com.company;

public class PersonBuilder {
    private String firstname;
    private String lastname;
    private String email;

    public PersonBuilder firstname(String firstname) {
        this.firstname = firstname;
        return this;
    }

    public PersonBuilder lastname(String lastname) {
        this.lastname = lastname;
        return this;
    }

    public PersonBuilder email(String email) {
        this.email = email;
        return this;
    }

    public String firstname() {
        return this.firstname;
    }

    public String lastname() {
        return this.lastname;
    }

    public String email() {
        return this.email;
    }

    public Person build() {
        // Optionally validate and throw exceptions for missing input.
        return new Person(firstname, lastname, email);
    }
}

When the build method is called, there’s no way to guarantee that all the fields of the Person have been specified. In the code above, that can only be determined at runtime.

java
Person person = new PersonBuilder()
                      .firstname("hello")
                      .lastname("world")
                      .build(); // oops, we forgot to specify the email

Sure, you could throw exceptions to deal with missing input, but then the code is no longer referentially transparent: calling build can now have a side effect, and every caller needs to deal with it as well.

An expression is said to be referentially transparent if it can be replaced by its value without changing the program’s behaviour. — Wikipedia

You could decide to write more tests, as the behaviour only shows up at runtime. However, there’s no need to do this if the Scala compiler can rule out the mistake in the first place. Why test something that is already guarded by the compiler?

Builder pattern in Scala #

Here is the entire code; we will then go through it step by step:

scala
import PersonBuilder.{Email, FirstName, FullPerson, LastName, PersonBuilderState}

case class Person(firstName: String, lastName: String, email: String)

object PersonBuilder {
  sealed trait PersonBuilderState
  sealed trait Empty extends PersonBuilderState
  sealed trait FirstName extends PersonBuilderState
  sealed trait LastName extends PersonBuilderState
  sealed trait Email extends PersonBuilderState

  type FullPerson = Empty with FirstName with LastName with Email

  def apply(): PersonBuilder[Empty] = new PersonBuilder("", "", "")
}

class PersonBuilder[State <: PersonBuilderState] private (
    val firstName: String,
    val lastName: String,
    val email: String) {
  def firstName(firstName: String): PersonBuilder[State with FirstName] =
    new PersonBuilder(firstName, lastName, email)

  def lastName(lastName: String): PersonBuilder[State with LastName] =
    new PersonBuilder(firstName, lastName, email)

  def email(email: String): PersonBuilder[State with Email] =
    new PersonBuilder(firstName, lastName, email)

  def build()(implicit ev: State =:= FullPerson): Person = {
    Person(firstName, lastName, email)
  }
}

We can start using the builder as follows:

scala
val person = PersonBuilder()
  .firstName("Hello")
  .lastName("World")
  .build() // Oops, we forgot to specify the email

If you try to compile this code, the compiler rejects it:

text
Cannot prove that PersonBuilder.Empty with PersonBuilder.FirstName with PersonBuilder.LastName =:= PersonBuilder.FullPerson

If we squint our eyes a bit, then it says:

scala
(Empty with FirstName with LastName) != FullPerson

This is correct, as our definition of FullPerson requires an e-mail address as well:

scala
type FullPerson = Empty with FirstName with LastName with Email

When you include the e-mail as well, you’ll see that everything compiles fine again:

scala
val person = PersonBuilder()
  .firstName("Hello")
  .lastName("World")
  .email("hello@world.com") // By adding the e-mail, it will compile again
  .build()

Using phantom types #

So how does this all work? In this section, we walk through the code in more detail.

First, we define a set of marker traits, one for each property we want to track:

scala
sealed trait PersonBuilderState
sealed trait Empty extends PersonBuilderState
sealed trait FirstName extends PersonBuilderState
sealed trait LastName extends PersonBuilderState
sealed trait Email extends PersonBuilderState

In the table below, you can see all the states that we have defined.

| State | Description |
| --- | --- |
| Empty | None of the properties have been set. |
| FirstName | The first name is set. |
| LastName | The last name is set. |
| Email | The e-mail is set. |

Second, we introduce a type alias that says a FullPerson is the combination of Empty with FirstName, LastName and Email.

scala
type FullPerson = Empty with FirstName with LastName with Email
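To make the alias a bit more tangible, here is a minimal sketch that asks the compiler directly for equality evidence between the spelled-out compound type and the alias. The wrapper object FullPersonAliasSketch and the value name aliasIsCompound are made up for this illustration; the traits are copied from the article.

```scala
object FullPersonAliasSketch {
  sealed trait PersonBuilderState
  sealed trait Empty extends PersonBuilderState
  sealed trait FirstName extends PersonBuilderState
  sealed trait LastName extends PersonBuilderState
  sealed trait Email extends PersonBuilderState

  type FullPerson = Empty with FirstName with LastName with Email

  // Compiles: the spelled-out compound type and the alias are the same type.
  val aliasIsCompound: (Empty with FirstName with LastName with Email) =:= FullPerson =
    implicitly[(Empty with FirstName with LastName with Email) =:= FullPerson]

  // Would NOT compile if uncommented, because Email is missing:
  // implicitly[(Empty with FirstName with LastName) =:= FullPerson]

  def main(args: Array[String]): Unit =
    println("Found evidence that the compound type equals FullPerson")
}
```

The commented-out line is the same situation the compiler complained about earlier: without Email there is no evidence to be found.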

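Next, we create a class with a type parameter State and use an upper bound to constrain it to be a subtype of PersonBuilderState. State is the phantom type: it never appears in a field or a value, only in the type signature.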

scala
class PersonBuilder[State <: PersonBuilderState]

In addition, each setter method extends the State in its return type, which gives the compiler additional type information. For example, firstName takes the current State and extends it with FirstName:

scala
def firstName(firstName: String): PersonBuilder[State with FirstName]
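The following sketch illustrates that the State parameter really is phantom. It uses a trimmed-down, hypothetical MiniBuilder (not the PersonBuilder from the article) with a single setter, and checks that the builder before and after the call shares one runtime class, even though the compiler sees two different types.

```scala
object PhantomErasureSketch {
  sealed trait PersonBuilderState
  sealed trait Empty extends PersonBuilderState
  sealed trait FirstName extends PersonBuilderState

  // State is a phantom type parameter: no field or method body ever uses it.
  final class MiniBuilder[State <: PersonBuilderState](val name: String) {
    def firstName(value: String): MiniBuilder[State with FirstName] =
      new MiniBuilder[State with FirstName](value)
  }

  val empty: MiniBuilder[Empty] = new MiniBuilder[Empty]("")
  val named: MiniBuilder[Empty with FirstName] = empty.firstName("Hello")

  // Both values share one runtime class; the State argument is erased.
  val sameRuntimeClass: Boolean = empty.getClass == named.getClass

  def main(args: Array[String]): Unit =
    println(sameRuntimeClass) // prints "true"
}
```

The only thing that changes between empty and named is the static type, which is exactly the information the build method relies on later.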

Consider the following sequence of steps:

| Step | Result |
| --- | --- |
| 1. We create a new builder with PersonBuilder(). | The State becomes Empty. |
| 2. We set the first name on the builder. | The State becomes Empty with FirstName. |
| 3. We set the e-mail on the builder. | The State becomes Empty with FirstName with Email. |
| 4. We set the last name on the builder. | The State becomes Empty with FirstName with Email with LastName. |

As you can see, the order doesn’t matter. We only need to make sure that we have a FullPerson the moment we call the build method.
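The steps in the table above can be sketched as a small program. This is a condensed variant of the article's builder, under two assumptions made for this example: everything lives in a hypothetical wrapper object BuilderOrderSketch, and the state traits sit next to the builder instead of inside its companion object, so no import is needed.

```scala
object BuilderOrderSketch {
  sealed trait PersonBuilderState
  sealed trait Empty extends PersonBuilderState
  sealed trait FirstName extends PersonBuilderState
  sealed trait LastName extends PersonBuilderState
  sealed trait Email extends PersonBuilderState

  type FullPerson = Empty with FirstName with LastName with Email

  case class Person(firstName: String, lastName: String, email: String)

  object PersonBuilder {
    def apply(): PersonBuilder[Empty] = new PersonBuilder[Empty]("", "", "")
  }

  class PersonBuilder[State <: PersonBuilderState] private (
      val firstName: String,
      val lastName: String,
      val email: String) {
    def firstName(firstName: String): PersonBuilder[State with FirstName] =
      new PersonBuilder[State with FirstName](firstName, lastName, email)

    def lastName(lastName: String): PersonBuilder[State with LastName] =
      new PersonBuilder[State with LastName](firstName, lastName, email)

    def email(email: String): PersonBuilder[State with Email] =
      new PersonBuilder[State with Email](firstName, lastName, email)

    // Only callable once State has collected every marker trait.
    def build()(implicit ev: State =:= FullPerson): Person =
      Person(firstName, lastName, email)
  }

  // Same order as in the table: first name, then e-mail, then last name.
  val person: Person = PersonBuilder()
    .firstName("Hello")
    .email("hello@world.com")
    .lastName("World")
    .build()

  def main(args: Array[String]): Unit = println(person)
}
```

Although FullPerson lists LastName before Email, the builder above sets them the other way around and still satisfies the constraint on build.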

The last part of the puzzle is the actual build method itself:

scala
def build()(implicit ev: State =:= FullPerson)

The magic symbol here is =:=, which expresses a type equality constraint. In other words, the compiler needs to find implicit evidence that State is exactly FullPerson at the point where the build method is called. If the compiler cannot prove this, you’ll run into a compile error.
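To see =:= in isolation, here is a minimal sketch outside the builder. The object TypeEqualitySketch and the helper increment are hypothetical names made up for this example; only =:= itself comes from the Scala standard library.

```scala
object TypeEqualitySketch {
  // Hypothetical helper: callable only when A is exactly Int.
  def increment[A](value: A)(implicit ev: A =:= Int): Int =
    ev(value) + 1 // the evidence also works as a safe conversion from A to Int

  // Compiles: A is inferred as Int, so the compiler finds Int =:= Int.
  val two: Int = increment(1)

  // Would NOT compile if uncommented, as there is no evidence for String =:= Int:
  // increment("one")

  def main(args: Array[String]): Unit =
    println(two) // prints "2"
}
```

The build method uses the same mechanism, with State in the role of A and FullPerson in the role of Int.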

Conclusion #

And that’s it! Phantom types are a very powerful concept that can make your code a lot more robust. They allow us to catch issues at the earliest stage possible, namely at compile time, which is also the cheapest stage to fix them.