Introducing Sorbet's T::Enum and T::Struct
A type checker…for Ruby?
I know; it’s blasphemous. Ruby is a beautiful, dynamic language that values developer happiness and productivity and doesn’t need a type system to achieve either of those goals. I’m essentially in agreement, and for many of my smaller projects, I fully embrace Ruby’s simplicity and dynamism to write code faster.
As I have worked on larger legacy Rails codebases, complex Ruby projects, and many gems with unclear interfaces, I have come to value a system that tells me and the other developers on my team what a piece of code expects. Recently at work, I’ve been using Sorbet from Stripe every day, and I increasingly love how it’s helping my team think through our method signatures in a way we might not have considered before. The sheer reduction of NoMethodError on NilClass makes Sorbet a valuable part of our toolchain.
I also love the incremental approach. Static Sorbet lets us quickly make certain that constants resolve throughout the codebase. Interested teams can then adopt higher strictness levels file by file and invest in adding Sorbet runtime type annotations to embrace the type system further. This isn’t a post on Sorbet adoption, but expect one in the future.
I want to chat about two classes Sorbet provides that I don’t feel have gotten enough attention. They are included in the Sorbet runtime gem and, after using them for a while now, feel like missing pieces of the Ruby standard library. Enter T::Enum and T::Struct.
Formally expressing a subset of options
Let’s start with T::Enum. If you’re familiar with other languages, most have a built-in type called an Enum or Enumeration. C#, Java, PHP, and recently Python are a few examples. Simply put, an enumeration lets us spell out the complete set of values that an instance of the type is allowed to take. Pulling from my recent post on types, the idea of currency is a good example of an enumeration. Our system might start out allowing only the USD currency if we’re a US-based company. Over time, we might expand into Canada and Europe and need to store CAD and EUR currencies.
Currently, we don’t have a great way of limiting a value to a subset of options in Ruby. You’ll most likely see something like the following. Note that I updated the code from the other post and switched away from YARD to Sorbet signature annotations. We’ll need the sorbet-runtime gem loaded for this code to work.
class Currency
  extend T::Sig

  USD = "USD".freeze
  CAD = "CAD".freeze
  EUR = "EUR".freeze
  ALLOWED_CURRENCIES = [USD]

  sig { params(currency_string: String).void }
  def initialize(currency_string:)
    # Normalize once so validation and #symbol compare the same value.
    normalized = currency_string.upcase
    raise ArgumentError unless ALLOWED_CURRENCIES.include?(normalized)

    @currency_string = normalized
  end

  sig { returns(String) }
  def symbol
    # Read the instance variable; the class defines no currency_string reader.
    case @currency_string
    when USD
      "$"
    when CAD
      "CA$"
    when EUR
      "€"
    else
      "" # or raise an exception
    end
  end
end
We need to validate the String passed in to this class and make sure it is an allowed currency. The way we refer to the allowed values is pretty simple, i.e., Currency::USD. However, the underlying type is still a String and not a Currency, so it can be hard to express this as the required type in a method signature. Finally, we’ve got a convenient method for deciding what symbol to display using a case statement. This behavior is acceptable, but we’ll always need a default case (regardless of our guard clause in initialize) for any String we don’t understand.
There’s another insidious side effect here as well. Let’s say we decide to add the yuan currency. We would add the new constant to this class. But what if we forget to update the ALLOWED_CURRENCIES array? Or forget to update our symbol method to return a value for the yuan?
Further, if another class somewhere else in the system is making a decision based on the currency type, how does it know there’s a new value to handle? The short answer is that the other code doesn’t know about the new value. We’ll probably need to wait until we hit a runtime error to notice the oversight.
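To make that failure mode concrete, here is a minimal sketch in plain Ruby. StringCurrency is a hypothetical, signature-free stand-in for the class above (so it runs without any gems), and I am assuming all three original currencies are already allowed. The new CNY constant is added, but the allow-list is not.

```ruby
# Hypothetical stand-in for the string-based class above, without Sorbet
# signatures. A CNY constant was added, but nothing else was updated.
class StringCurrency
  USD = "USD".freeze
  CAD = "CAD".freeze
  EUR = "EUR".freeze
  CNY = "CNY".freeze # the new constant

  # Oops: CNY never made it into the allow-list.
  ALLOWED_CURRENCIES = [USD, CAD, EUR].freeze

  attr_reader :currency_string

  def initialize(currency_string:)
    normalized = currency_string.upcase
    unless ALLOWED_CURRENCIES.include?(normalized)
      raise ArgumentError, "unsupported currency: #{currency_string}"
    end

    @currency_string = normalized
  end
end

begin
  StringCurrency.new(currency_string: StringCurrency::CNY)
rescue ArgumentError => e
  # Nothing warned us earlier; the gap only shows up once this line runs.
  puts e.message
end
```

The file loads without complaint, and the mistake stays hidden until some code path actually builds a yuan value.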
Let’s see how we could use the Sorbet T::Enum type to clean this up.
class Currency < T::Enum
  extend T::Sig

  enums do
    USD = new
    CAD = new
    EUR = new
  end

  sig { returns(String) }
  def symbol
    case self
    when USD
      "$"
    when CAD
      "CA$"
    when EUR
      "€"
    else
      T.absurd(self)
    end
  end
end
Check out how this is immediately more readable and expressive for defining our enumeration. We can eliminate our initialize method. Also, instead of storing strings in constants, each option specified in the enums block is an instance of the Currency class itself. We still refer to these like the previous version, i.e., Currency::USD. The main difference is that this is now both the type and the value we can pass to a method. If we say we need a Currency, we can pass in Currency::EUR directly without initializing a Currency object.
Exhaustiveness checking inside our symbol method also gets easier. We still need the else block because Ruby can’t determine we’re handling all cases, but with T.absurd in that branch the Sorbet type checker will complain if we leave off an enumeration value. This safety applies to any part of the codebase relying on a Currency type; we’ll immediately see errors from srb tc that point us to the code we need to update.
This enumeration is what we were trying to express in the original code. With Sorbet’s assistance, we can fully express the values and have safety throughout the codebase if we ever add or remove an option from this Enum.
One last feature is the built-in serialize and deserialize methods on T::Enum. Because we’ve swapped the underlying representation from a String to a Ruby object, if we need to pass this Enum in a JSON response or store it in a database, we need to convert it to a more primitive type. Here’s the default behavior for this with our current Enum.
Currency::USD.serialize # => "usd"
Currency::EUR.serialize # => "eur"
Currency.deserialize("cad") # => Currency::CAD
By default, Sorbet will downcase the name of the enum value and serialize it as a String. We can specify how we want the serialization to occur by passing an argument when creating our values in the enums block.
enums do
  USD = new("USD")
  CAD = new("CAD")
  EUR = new("EUR")
end
Here’s how the previous statements look now.
Currency::USD.serialize # => "USD"
Currency::EUR.serialize # => "EUR"
Currency.deserialize("CAD") # => Currency::CAD
Adding a powerful data packaging class
Ruby has Struct and OpenStruct built-in, which are convenience classes that aid with creating simple data classes; effectively just getters and setters for fields. Other languages have a similar idea; Kotlin has the data class, and C# has structs built-in.
Sorbet gives us a more powerful version of a Struct to play with called (you guessed it) T::Struct. In addition to being initialized like Ruby’s built-in structs, we can also define types for each field and whether they are allowed to change or must remain constant (mutable or immutable). Pulling from this post again, let’s see how we can improve the Money class.
class Money
  attr_reader :cents, :currency

  # @param cents [Integer]
  # @param currency [Currency]
  # @raise [ArgumentError]
  def initialize(cents:, currency:)
    raise ArgumentError unless cents > 0

    @cents = cents
    @currency = currency
  end
end
Money is a very straightforward class. We take in two pieces of data in initialize, and then we add attr_readers for them. Note that the lack of an attr_writer/attr_accessor means that once an instance is initialized, its data is immutable. Here’s how we could express this as a T::Struct.
class Money < T::Struct
  const :cents, Integer, default: 0
  const :currency, Currency, default: Currency::USD
  prop :wrinkly, T::Boolean, default: false
end
Money.new # == Money.new(cents: 0, currency: Currency::USD, wrinkly: false)
Money.new(cents: 100_00, currency: Currency::CAD) # => CA$100 crisp
We took an already straightforward class and made it that much simpler. First, we inherit from T::Struct to enable the const and prop class-level methods. Then, we declare cents as a const field, meaning it is immutable (we only supply the reader). We can also give it a type and set a default. Using our new Currency enum, we do something similar with currency. Finally, I introduced a contrived field to show an example of prop and T::Boolean. wrinkly is a mutable field, so the struct supplies both a reader and a writer for it on the instance, much like attr_accessor. Generally, I avoid prop fields, as I favor an immutable approach, especially for my structs.
One thing to note: we did lose our guard clause from the previous initialize method. We could implement our own initialize and then have it call super to get the out-of-the-box behavior. While working with structs, I tend to avoid argument errors and conditional guards and instead rely on the given types to support me. If someone gives me a value that matches the type, it should be a valid instance of the struct. I might need other error handling code elsewhere in my system, but we should not prevent the struct from being initialized. In this case, if a negative number is given for cents, we can start modeling that as a debit cash flow versus a positive cash flow.
In conclusion
In addition to being a killer companion for statically analyzing Ruby code, Sorbet provides us with some great built-in behavior that, as you use it more, you begin to wish was a core part of Ruby. Both T::Enum and T::Struct primarily exist to make writing typed code more straightforward and expressive in Ruby, which some might count as a detriment to Ruby’s dynamic nature. As I have used them, the opposite has occurred. I tend to spend more time thinking through my basic data types as immutable structs to be passed around the system. I think through how I could represent a subset of options as enums that can be spread throughout the system so I can see where changes need to be made. Sorbet is changing how I approach designing Ruby systems, and I’m on board for the ride.