@prophet

Be very explicit about the difference between type variables, unification variables and skolems

Type variables are always bound by foralls and represent variables that can be instantiated to any possible concrete type. For example, forall a. a -> a can be instantiated to Int -> Int, String -> String, or Bool -> Bool.

Unification variables are placeholders that stand for as-yet-unknown concrete types. They usually originate from uses of polymorphic types: type variables are instantiated with fresh(!!!) unification variables at every use site, since every use site may instantiate the type variable to a different concrete type, and which one that is might not be known yet. More generally, they are useful anywhere a type is needed that might not be fully known yet (e.g. unannotated lambda parameters or recursive let bindings).
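To make the "fresh at every use site" point concrete, here's a rough Python sketch. The tuple representation and all names here are made up for illustration:

```python
import itertools

fresh = itertools.count()

def instantiate(bound, body):
    """Replace every forall-bound type variable in `body` with a fresh
    unification variable. Crucially this happens once per *use site*,
    so two uses of the same binding never share placeholders."""
    subst = {v: ("UVar", next(fresh)) for v in bound}
    def go(t):
        if isinstance(t, str):                    # a type variable
            return subst.get(t, t)
        if isinstance(t, tuple) and t[0] == "Fun":
            return ("Fun", go(t[1]), go(t[2]))
        return t
    return go(body)

# id : forall a. a -> a, used at two different call sites:
use1 = instantiate(["a"], ("Fun", "a", "a"))   # ?0 -> ?0
use2 = instantiate(["a"], ("Fun", "a", "a"))   # ?1 -> ?1
# each site can now be solved at a different concrete type
```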

Skolems, on the other hand, are type constants that are only equal to themselves. These are needed when checking against polymorphic types: a function of type forall a. a -> a cannot make any assumptions about the type of its parameter, which is ensured by checking it against the type #a -> #a (where #a is a fresh skolem).

These three are often confused since they could all reasonably be called "type variables", but doing so will only bring you pain. They are distinct concepts and should really be treated as such.

Unfortunately, GHC is not great at this. In error messages, unification variables are usually displayed with a trailing number (a0), although something like a1 can be either a unification variable or a skolem if just a would be ambiguous. The only way to tell which one it is is to look for 'a1' is a rigid type variable bound by... (you know, in the section of the error that no one actually reads because it's massive and almost never useful)



in reply to @prophet's post:

Yes!
A type variable syntactically represents a place where a type can be inserted, but the variable itself is immutable.
Instantiating a polymorphic type replaces the variable (as in, returns a new type with the variable replaced) with a fresh unification variable that might be mutated.

In fact, in practice, all three kinds of variables will have slightly different implementations

data Type
  = TypeVariable Name                                     -- bound by a forall
  | UnificationVariable Name Unique (IORef (Maybe Type))  -- mutable placeholder, Nothing = unsolved
  | Skolem Name Unique                                    -- rigid constant, equal only to itself

It is possible to treat unquantified type variables as skolems (and this is how inference rules are usually written), but it's a bit error prone. One still needs to replace forall-bound type variables with fresh ones when skolemizing to avoid name capture, so there isn't much to be gained this way tbh.

hmm, interesting! in the one video i watched about unification i remember the unification happening somewhat out-of-place (i.e., not mutatively), so i guess that's what i was thinking

something like the unification algorithm returning a collection of eq classes of variables; you could choose to replace each with a representative or with some other type

You can use a Map Unique Type as an explicit substitution instead of mutable references. It's just slower, less convenient and more error prone this way (because you can forget to apply the substitution).
The version with mutable references also collects eq classes as an implicit global union find data structure! You can find the representative of a unification variable's eq class by following its reference until you hit either a concrete type or an unsubstituted unification variable (i.e. one where the reference is Nothing).
It took me like two years to make that connection ^^
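That reference-chasing step could look roughly like this in Python, with a mutable field standing in for the IORef (a sketch; all names are made up):

```python
class UVar:
    """A unification variable: a mutable cell that is either unsolved
    (ref is None) or points at another type -- IORef (Maybe Type)."""
    def __init__(self, name):
        self.name = name
        self.ref = None

def find(ty):
    """Follow references until hitting a concrete type or an unsolved
    unification variable: union-find's `find`, minus path compression."""
    while isinstance(ty, UVar) and ty.ref is not None:
        ty = ty.ref
    return ty

def union(a, b):
    """Unify two unification variables by pointing one cell at the other."""
    a.ref = b

a, b, c = UVar("a"), UVar("b"), UVar("c")
union(a, b)                 # a ~ b
union(b, c)                 # b ~ c, so {a, b, c} is one eq class
rep = find(a)               # the unsolved representative: c
c.ref = ("Con", "Int")      # solving c solves the whole class
solved = find(a)            # ("Con", "Int")
```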

That's an interesting connection!
On the theoretical side, unification variables don't actually exist as part of the language, they're more of a hallucination of the type checker.
In fact, inference rules don't even mention them, they just "guess" a type. E.g. the standard rule for instantiation

———————————————————————————
Γ, x : ∀a. τ1 ⊢ x : τ1[τ/a]

replaces every type variable with a concrete type but doesn't specify which one (which is fine because inference rules are relations, not functions).

I think I have run into exactly this confusion while implementing my own type checker. I'm running type inference on every expression from the inside out, so if an expression references a name, even one defined in the same scope, I won't know about it yet. All I can do is infer the type constraints on that name and then check whether those inferred constraints are compatible with the actual type of that named value.

But if I have a function f : Maybe String -> () and I see the expression f x, then surely I can infer that x : Maybe String, right? Well, not quite, because x could be Nothing, in which case its type would be Maybe a. But if I make x a parameter of the current scope, then its type is inferred to be Maybe String. And if I add a type annotation saying that x is actually a Maybe a, it no longer lets me use it in a place that expects a Maybe String. So what's going on here? Well, it must be that the a in Nothing : Maybe a and the a in the parameter are not the same kind of thing.

Here's an example

scope : Maybe a -> ()
scope maybe =
    let
        strListWithMaybe =
            -- `maybe` is inferred to be `Maybe String` which is not compatible with `Maybe a`
            [ Just "hi", maybe ]
            --           ^^^^^

        none : Maybe b
        none =
            Nothing

        strListWithNone =
            [ Just "hi", none ] -- ✅ all good
        intListWithNone =
            [ Just 123, none ] -- ✅ all good
        charListWithNone =
            [ Just 'c', none ] -- ✅ all good
    in ()

I think the solution here is like you say: distinguish between a type param that is a rigid unification variable – like the a in maybe : Maybe a inside of scope – and a skolem, like the b in none : Maybe b, which is really a local type scheme none : forall b. Maybe b to be substituted with a concrete type wherever it is used.

Yeah, that's exactly the issue!

The type of Nothing is not Maybe a, but forall a. Maybe a. If you use the constructor somewhere, you need to instantiate that type (because inference only works with monotypes) to Maybe ?a where ?a is a unification variable.
Maybe ?a then unifies with Maybe String just fine.

But if you check the body of a function against a polytype like forall a. Maybe a -> (), that type is not instantiated, but skolemized to Maybe #a -> (). It's still important that this #a is fresh since you don't want it to conflict with other skolemizations of type variables that happen to be called a, but it's not a unification variable.
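A minimal side-by-side sketch of the two operations in Python (the tuple representation is made up; the thread's ?a and #a become tagged tuples):

```python
import itertools

counter = itertools.count()

def substitute(subst, t):
    """Replace type variables (plain strings) according to `subst`."""
    if isinstance(t, str):
        return subst.get(t, t)
    if isinstance(t, tuple) and t[0] in ("Fun", "Maybe"):
        return (t[0],) + tuple(substitute(subst, x) for x in t[1:])
    return t

def instantiate(bound, body):
    """Use site: replace forall-bound variables with fresh unification
    variables (the ?a placeholders), which may later be solved."""
    return substitute({v: ("UVar", next(counter)) for v in bound}, body)

def skolemize(bound, body):
    """Checking against a polytype: replace forall-bound variables with
    fresh rigid constants (the #a's), equal only to themselves."""
    return substitute({v: ("Skolem", v, next(counter)) for v in bound}, body)

# forall a. Maybe a -> ()
ty = ("Fun", ("Maybe", "a"), ("Con", "Unit"))
inst = instantiate(["a"], ty)   # Maybe ?0 -> ()
skol = skolemize(["a"], ty)     # Maybe #a -> ()
```

The freshness counter is shared on purpose: two separate skolemizations of variables that both happen to be called a still produce distinct constants.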

Also, this is probably not too relevant to your type system, but in general, a bidirectional type system checks the argument to f against Maybe a. If your type system has subtyping (e.g. through polymorphism), the argument type doesn't need to match the expected type exactly, but it can also be a subtype.
In this case, another way to view it is that forall a. Maybe a is a subtype of Maybe String and therefore should be accepted as an argument.

Mm interesting. I'll have to adjust my implementation then.

My approach so far has been:

  1. whenever a name is referenced, keep track of any constraints imposed on it by its usage – f x implies x must be at least as specific as f requires its input to be
  2. eagerly try to unify all these constraints – e.g if we see x being used in one place that expects a (String, a) and in another place that expects a (b, Int) then that can be unified to (String, Int)
  3. when we get to x's declaration, infer its type from the expression body of its definition (or its type annotation)
  4. and try to unify that type with the other, already unified, constraints from usage
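Step 2 above could be sketched roughly like this: a toy first-order unifier over a made-up representation (no occurs check or other real-world details; concrete type names are never used as variables in this demo):

```python
def resolve(t, subst):
    """Follow solved variables to their current value."""
    while isinstance(t, str) and t in subst:
        t = subst[t]
    return t

def unify(t1, t2, subst):
    """Structural unification: plain strings are variables, ("Con", n)
    is a concrete type, other tuples are applied type constructors."""
    t1, t2 = resolve(t1, subst), resolve(t2, subst)
    if t1 == t2:
        return subst
    if isinstance(t1, str):
        subst[t1] = t2           # solve the variable (no occurs check!)
        return subst
    if isinstance(t2, str):
        subst[t2] = t1
        return subst
    if (isinstance(t1, tuple) and isinstance(t2, tuple)
            and len(t1) == len(t2) and t1[0] == t2[0]):
        for x, y in zip(t1[1:], t2[1:]):  # same head: unify the arguments
            subst = unify(x, y, subst)
        return subst
    raise TypeError(f"cannot unify {t1} and {t2}")

# x is used where (String, a) is expected and where (b, Int) is expected:
s = unify(("Pair", ("Con", "String"), "a"),
          ("Pair", "b", ("Con", "Int")), {})
# s solves a := Int and b := String, i.e. (String, Int)
```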

But I then realised that this doesn't allow for any polymorphism. If a function g : a -> () is ever called with a String, g's type parameter a is immediately unified with String, narrowing g to String -> () and destroying its polymorphism.

I then thought of fixing that by letting named values' inferred-from-definition types be as general as they need to be: the a in none : Maybe a becomes a generic that can be unified with the expected type of the context it is situated in – e.g. [ Just 123, none ] makes none : forall a. Maybe a unify with the Maybe Int of the expression it is in – without letting that constraint-from-usage flow back to the named value none itself.

But that cannot be done with my gather-usages-and-unify-as-you-go approach, because as soon as we see a Maybe String constraint from here and a Maybe Int from there, that will already register as a type error, because those can't be unified. I was getting really confused about why that didn't work. Surely if there are these two constraints then that is indeed a clash? But if that's the case, then how is it possible that none : Maybe a can indeed be unified with both concrete types? Should we not try to unify type parameters of a type, then? So that if we see f x, should we only infer x : Maybe a even though we know it must be a Maybe String? I just couldn't make sense of it, until I came across this post and https://cohost.org/prophet/post/3169471-distinguish-between. It's only then that I realised that these are two different things we're dealing with, and Elm not distinguishing between these two different kinds of generic a is supremely unhelpful.

So in order to get the correct behaviour I think what I need to do is:
a) keep track of all constraints on value x without passing judgment on whether those can all be unified, until:
b) we actually see x's definition, and only then check if those inferred constraints can be unified with x's actual type
c) and if x is not a value with a definition in the current scope – say x is a parameter in the function you are in – only then do we indeed try to unify all the gathered constraints. If they do unify: great, that's your inferred type of parameter x. If not, oopsie, you indeed have a type error. Because at that point a is not a local type variable that can unify with anything and everything, but a stand-in for a single, rigid, type. Which is either a String or an Int, but certainly not both.

After thinking about this for a few more days, am I right in thinking that skolems and type variables are really two sides of the same coin?

Inside the scope where the type variable a is used, we call it a skolem, and it acts as a stand-in for any type, meaning that we cannot make any assumptions about what it is – but outside the scope where a is defined, we call it a type variable, and there it can be substituted with any and every type. Why can a type variable unify with every type? Well, precisely because inside the scope (where we call it a skolem) we made sure that it is compatible with any type. Since we've done the hard work of making sure not to expect it to be any specific type, we can substitute anything for a and be confident that our function will still work with that substituted type.

Sort of like... if you're expecting a guest but you have no idea what language they speak, you better make sure that your house is fully accessible without expecting the guest to be able to read any instructions you could write down for them. You have to be really strict about this and even your alarm system better be incredibly simple and intuitive so it can be operated without the operator having to understand a single word of English.
But once you've done that work, you can now host any guest of any nationality!

In this example this prospective guest of unknown nationality that you have to accommodate is the skolem. And the actual guest who may speak any of French/English/Swahili/Uzbek/etc is the type variable.

Have I understood that correctly?

Yes that's exactly the insight behind skolemization and typing polymorphic functions! I like your metaphor!

If you want something fun to think about, consider how one proves a universally quantified statement in predicate logic in pretty much the same way (by proving it for an arbitrary value without making any assumptions about it) and how, by propositions-as-types, that's actually the exact same thing as skolemization in type inference :)
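One way to see this correspondence concretely (a sketch in Lean 4):

```lean
-- ∀-introduction: to prove a statement for all α, introduce an
-- arbitrary α and prove the body without assuming anything about it.
-- By propositions-as-types, this term *is* the polymorphic identity
-- function: the α introduced by `fun` plays exactly the role of a skolem.
def idPoly : ∀ (α : Type), α → α :=
  fun α x => x
```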

Be careful though! The variables that unify with anything are not type variables! They're unification variables.

Type variables are a syntactic construct of the language that is part of a universally quantified type (e.g. forall a. a -> a). On its own, a type variable doesn't really mean much except that when instantiating the function at a concrete type, that's where it's going to go.
Only when we decide what we're going to do with a polymorphic type, do we imbue it with meaning by instantiating it.

If we want to use a variable of a polymorphic type, we can instantiate its type variables with anything, but we don't necessarily know what we need to instantiate it with yet (unless we have a type application. In that case we can just instantiate it directly with the applied type). That's why we don't make that decision immediately but just instantiate them with placeholders (unification variables) and defer figuring out the exact instantiations until unification.

If we want to check a declaration against a polymorphic type, we need to do exactly what you described and show that it is valid for any possible instantiation, so we instantiate type variables with arbitrary constants that we cannot make any assumptions about other than that they're equal to themselves - skolems.

Also, if we're defining a declaration without a type signature, we don't know its type yet so we use a unification variable (not a type variable!).
But if by the end of checking this declaration, we come up with a type that mentions unbound unification variables that don't escape the declaration, we know that the declaration doesn't make any assumptions about them so we could instantiate them with any type and can therefore extend our type to a universal with a proper type variable where the unification variable was. (e.g. ?a -> ?a might become forall a. a -> a). That's what generalization is! It's the same concept once again.
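That generalization step might look roughly like this (Python, made-up representation; the sketch assumes none of the unification variables escape the declaration):

```python
def free_uvars(t, acc=None):
    """Collect unsolved unification variables, in order of appearance."""
    acc = [] if acc is None else acc
    if isinstance(t, tuple) and t[0] == "UVar":
        if t not in acc:
            acc.append(t)
    elif isinstance(t, tuple):
        for x in t[1:]:
            free_uvars(x, acc)
    return acc

def generalize(t):
    """Turn each remaining unification variable into a proper
    forall-bound type variable."""
    names = {u: f"t{i}" for i, u in enumerate(free_uvars(t))}
    def go(x):
        if isinstance(x, tuple) and x[0] == "UVar":
            return names[x]
        if isinstance(x, tuple):
            return (x[0],) + tuple(go(y) for y in x[1:])
        return x
    return ("Forall", list(names.values()), go(t))

# ?0 -> ?0 becomes forall t0. t0 -> t0
scheme = generalize(("Fun", ("UVar", 0), ("UVar", 0)))
```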