Types are about whole-program information

2021-11-01

The debate about types is kind-of played out, but I think that's mostly down to perspective, and the way we talk about types as tools. "Types help me make fewer mistakes!" or "Types get in my way!" Both of these sentiments are common and both are true in their own way, but they are looking at the problem from a localized "my workflow" view.

A typechecker is a collection of rules. You run your code through these rules, and it either conforms with the rules or not. You write code, annotating the "do real work" bits of your code with a sub-language ("types") that the typechecker understands, and when you do this, you get answers back about if your code abides by the typechecker rules, and if not, why not.

This has costs. You pay in time spent adding the type annotations. You pay in compile time. You pay in time spent trying to understand compiler errors, and trying to understand why something doesn't compile. As an organization, you pay in time spent training people on these tools, and time spent waiting for them to ramp up in productivity. You have to live under a cold, mechanical regime that doesn't care about your deadlines or how long you've been working on this piece of software or whatever else. You have to pay these costs in order to participate in the typechecker.

But in exchange for this cost, you get some benefits.

The typechecker doesn't just check your own code. It checks everyone's code. This is crucial: it is a global tool. In order to build and run your project, all of the code must typecheck successfully. This includes your code, your coworkers' code, and the code for all the dependencies you use in that project.

You and everyone on your team gets information about how their code fits together, whenever they want, as many times as they want, for any changes they make, and they get it globally, now and in the future, for all of your code and your dependencies' code.

You can think of the constraints a typechecker imposes as being like the rules of the road, in a way: you have to pay to build and maintain roads, and roads have certain laws to ensure their safety and continued operation, but once they are built, driving becomes much more predictable than attempting to drive across muddy, untamed countryside. You are giving up some local freedom (the freedom to drive with no rules at all) in exchange for a higher-order freedom (you can now drive and navigate much faster, easier, and more reliably).

As far as I can tell, RĂșnar Bjarnason has the best talk on the subject that really digs in to the theory of this kind of composition tradeoff. This talk is one of my favorites and I think it should be required viewing for people in the industry.

Now, in fairness, typecheckers don't say anything about whether the types you've used to model your domain are any good, whether your typesystem itself is actually useful, whether you've solved the right problem, hired the right team, or anything else, but what does? On this point, I think static typing detractors usually undersell their usefulness, especially when it comes to the kind of programming people do in teams over longer periods of time, which is dominated by maintenance, tweaks, little feature additions, refactorings, security patches, and that kind of thing.

I do think that some of the criticism that dynamic typing proponents make of types hits home, though. Some communities - the functional programming communities, in particular - have really sold types poorly, presenting typed programming on the basis of correctness and proof, when the real wins to be had are about increasing our ability to understand our own systems. Proof and correctness are noble values and ought to be pursued, but not, I think, at the expense of a more general kind of engineering utility and intelligibility.

I'm still bullish, though, particularly when comparing typed vs. untyped programming on the axis of time: most load-bearing software in the world has been around for a while, and needs to be around for a while more, because people still depend on it! Most of the software doing useful stuff in the world cannot be called "greenfield". Everyone's experience will be different, but most of my career has been an exercise in "hey, modify this existing thing in some way".

I have worked on greenfield projects, but the actual greenfield phase of the project, where you're just slamming out code until the thing works at all, only lasts a few months at most. Once that's over, you have users, you have expectations, you have maintenance. At that point, what you want is to understand your system as holistically as possible. Your ability to write code as quickly as possible is no longer the thing holding back your organization. Your ability to understand how your system interacts with itself matters much more.

Personally, I'm not going to stop writing dynamically typed code. There are some really great, productive dynamic languages out there (I love Elixir in particular), but for code that has to last any length of time, I think that constraining ourselves to the subset of programs that can compile under a reasonable typechecker is a productive restriction, like an artist limiting their pallette to fewer colors intentionally, or an aircraft designer limiting themselves to the subset of aircraft designs that are aerodynamically stable. Not all constraints are actually limiting.

Of course, typed programming is just one way of understanding how a system interacts with itself. There are other ways to approach this question, but discussing those ways probably deserves its own post, and I wanted to write about types.