Canonical LF

This is a representation of LF in which all terms are automatically in canonical form. The key idea is to segregate the type families and the terms into atomic and normal forms where the term variables only stand for atomic terms, and not arbitrary ones. Then, a substitution procedure is defined that takes terms in normal form and performs a substitution for a variable while simultaneously normalizing any redices that occur.

{-# LANGUAGE DeriveGeneric, StandaloneDeriving, DeriveDataTypeable,
    ViewPatterns, RankNTypes, FlexibleContexts, FlexibleInstances,
    FunctionalDependencies, TypeFamilies
  #-}
module CanonicalLF where
import Unbound.Generics.LocallyNameless
import GHC.Generics (Generic)
import Data.Typeable (Typeable)
import qualified Data.Map as M
import Control.Monad.Reader
import Control.Monad.Except
import Data.Functor.Identity
import Control.Applicative (Const(..))

Syntax

An LF signature introduces type family atoms and constant terms.

data Signature = NilS
  | SnocAtom (Rebind Signature (Atm, Embed Kind))
  | SnocConst (Rebind Signature (Cnst, Embed Type))
  deriving (Show, Generic, Typeable)

The type families are classified by kinds and may either be plain types, or pi-kinds for families of types indexed by a term variable.

data Kind = TypeK | PiK (Bind (Var, Embed Type) Kind)
  deriving (Show, Generic, Typeable)

The atomic type families are either type familiy atoms applied to zero or more terms in normal form.

type Atm = Name P
data P = AtmP Atm | AppP P Term
  deriving (Show, Generic, Typeable)

Type families in normal form are either atomic type families or dependent product types indexed by a term variable of normal type.

data Type = PT P | PiT (Bind (Var, Embed Type) Type)
  deriving (Show, Generic, Typeable)

The atomic terms are either variables or constants applied to zero or more terms in normal form.

type Cnst = Name R
type Var = Name R
data R = VarR Var | ConstR Cnst | AppR R Term
  deriving (Show, Generic, Typeable)

A term in normal form is either an atomic term or a lambda abstraction that binds a term variable.

data Term = RM R | LamM (Bind Var Term)
  deriving (Show, Generic, Typeable)

When typechecking kinds, types or terms, new term variables may come into scope. They are collected in contexts.

data Context = NilC
             | Snoc (Rebind Context (Var, Embed Type))
             deriving (Show, Generic, Typeable)

All the syntactic objects are equivalent upto renaming of bound variables.

instance Alpha Signature
instance Alpha Kind
instance Alpha P
instance Alpha Type
instance Alpha R
instance Alpha Term
instance Alpha Context

The metatheory of Canonical LF uses simple types to prove the termination of hereditary substitution (defined below). But they aren't needed in the implementation. (Although it would be interesting to lift them to Haskell kinds and index the terms by the simple types to disallow some malformed terms.)

data SimpleType = AtmS Atm | ArrS SimpleType SimpleType
  deriving (Show, Generic, Typeable)

instance Alpha SimpleType

Hereditary Substitution

Variables in Canonical LF stand for atomic terms, but we will need to subtitute terms for them. If we used ordinary capture-avoiding substitution, such substitution would produce redices, which we are precisely what we don't want. However redices will potentially only occur when the variable for which we're substituting occurs at the head of an atomic term.

isHeadVarR :: Var -> R -> Bool
isHeadVarR x (VarR y) = x == y
isHeadVarR _ (ConstR _) = False
isHeadVarR x (AppR r _) = isHeadVarR x r

Just using a boolean to decide if the variable is at the head of an atomic term is fine, but we can actually partition an atomic term into its head variable or constant together with a spine of applications.

data Spine a = NilSp a | AppSp (Spine a) Term
data Head = VarH Var | ConstH Cnst

If the variable for which we'll be substituting is at the head we only really care about the spine. Otherwise we have some other variable, or perhaps a constant at the head.

headSpine :: Var -> R -> Either (Spine Head) (Spine ())
headSpine x (VarR y) | x == y    = Right (NilSp ())
                     | otherwise = Left (NilSp (VarH y))
headSpine _ (ConstR c)           = Left (NilSp (ConstH c))
headSpine x (AppR r m)           = case headSpine x r of
  Left s -> Left (AppSp s m)
  Right s -> Right (AppSp s m)

Substitution in a kind just carries out substitution in the types. Likewise substutition in normal type families.

substKind :: Fresh m => Term -> Var -> Kind -> m Kind
substKind _ _ TypeK = return TypeK
substKind m x (PiK bnd) = do
  ((y, unembed -> a), k) <- unbind bnd
  a' <- substType m x a
  k' <- substKind m x k
  return $ PiK $ bind (y, embed a') k'

substType :: Fresh m => Term -> Var -> Type -> m Type
substType m x (PT p) = do
  p' <- substP m x p
  return (PT p')
substType m x (PiT bnd) = do
  ((y, unembed -> a), b) <- unbind bnd
  a' <- substType m x a
  b' <- substType m x b
  return $ PiT $ bind (y, embed a') b'

Atomic type family application substitutes a term for a variable in the (normal) index terms.

substP :: Fresh m => Term -> Var -> P -> m P
substP _ _ (AtmP a) = return (AtmP a)
substP m x (AppP p n) = do
  p' <- substP m x p
  n' <- substTerm m x n
  return (AppP p' n')

Normal term substitution goes under a lambda (freshness is ensured by the library) and into the atomic term.

substTerm :: Fresh m => Term -> Var -> Term -> m Term
substTerm m x (RM r) = substR m x r
substTerm m x (LamM bnd) = do
  (y, n) <- unbind bnd
  n' <- substTerm m x n
  return $ LamM $ bind y n'

To substitute in an atomic term we separate the head and the spine and proceed according to whether the variable at the head is the one we are substituting for.

substR :: Fresh m => Term -> Var -> R -> m Term
substR m x r =
  case headSpine x r of
    Left rsp -> RM <$> substRRsp m x rsp
    Right msp -> substMsp m x msp

If there is another variable or a constant at the head, we will get some kind of atomic term out since the head is unchanged and we only substitute into the index terms.

substRRsp :: Fresh m => Term -> Var -> Spine Head -> m R
substRRsp _ _ (NilSp h) = return (headR h)
substRRsp m x (AppSp sp n) = do
  n' <- substTerm m x n
  r <- substRRsp m x sp
  return $ AppR r n'

headR :: Head -> R
headR (VarH y) = VarR y
headR (ConstH c) = ConstR c

When the variable we care about is at the head, we apply the substitution to the rest of the spine to get a normal term (which, by the metatheory will turn out to be some kind of a lambda), apply the substitution to the index, and then carry out a new substitution of the new index for the variable bound by the lambda to the body of the lambda. This last step is the heredetary part of heredetary substitution. The metatheory guarantees that this process will terminate. (Because the simple type of the body of the lambda is smaller than the simple type of the original term).

substMsp :: Fresh m => Term -> Var -> Spine () -> m Term
substMsp m _ (NilSp ()) = return m
substMsp m x (AppSp s n) = do
  o_ <- substMsp m x s
  n' <- substTerm m x n
  case o_ of
    LamM bnd -> do
      (y, o) <- unbind bnd
      substTerm n' y o
    _ -> error "can't happen"

Typechecking

We typecheck in an environment that maps type family atoms, term costants and variables to their respective kinds and types.

data Env = Env { _envAtm :: M.Map Atm Kind,
                 _envConst :: M.Map Cnst Type,
                 _envCtx :: M.Map Var Type }

emptyEnv :: Env
emptyEnv = Env M.empty M.empty M.empty

Some lenses to work with the environment

envAtm :: Lens' Env (M.Map Atm Kind)
envAtm afb s = fmap (\atms -> s { _envAtm = atms }) $ afb (_envAtm s)

envCtx :: Lens' Env (M.Map Var Type)
envCtx afb s = fmap (\ctx -> s { _envCtx = ctx } ) $ afb (_envCtx s)

envConst :: Lens' Env (M.Map Cnst Type)
envConst afb s = fmap (\sig -> s { _envConst = sig} ) $ afb (_envConst s)

And some combinators to perform lookups.

lookupOver :: (MonadReader e m, MonadError String m) => Getting p e p -> String -> (p -> Maybe c) -> m c
lookupOver l s f = do
  mk <- views l f
  case mk of
    Nothing -> throwError $ "unbound " ++ s
    Just c -> return c

lookupAtom :: (MonadReader Env m, MonadError String m) => Atm -> m Kind
lookupAtom = lookupOver envAtm "atom" . M.lookup 

lookupVar :: (MonadReader Env m, MonadError String m) => Var -> m Type
lookupVar = lookupOver envCtx "variable" . M.lookup

lookupConst :: (MonadReader Env m, MonadError String m) => Cnst -> m Type
lookupConst = lookupOver envConst "constant" . M.lookup

To check a signature we check the kind or type classifying the atom or constant and then continue in an environment extended with the new binding.

withSigOk :: (Fresh m, MonadReader Env m, MonadError String m) => Signature -> m r -> m r
withSigOk NilS kont = kont
withSigOk (SnocAtom (unrebind -> (s, (a, unembed -> k)))) kont =
  withSigOk s $ do
  local (set envCtx M.empty) $ wfk k
  local (over envAtm (M.insert a k)) kont
withSigOk (SnocConst (unrebind -> (s, (c, unembed -> a)))) kont =
  withSigOk s $ do
  local (set envCtx M.empty) $ wfType a
  local (over envConst (M.insert c a)) kont

Kind and normal type formation is unsurprising. Note that PT is only well-formed when the atomic type family is fully applied and is of base kind.

wfk :: (Fresh m, MonadReader Env m, MonadError String m) => Kind -> m ()
wfk TypeK = return ()
wfk (PiK bnd) = do
  ((x, unembed -> t), k) <- unbind bnd
  wfType t
  local (over envCtx (M.insert x t)) $ wfk k

wfType :: (Fresh m, MonadReader Env m, MonadError String m) => Type -> m ()
wfType (PT p) = do
  k <- inferP p
  case k of
    TypeK -> return ()
    _ -> throwError "expected a type"
wfType (PiT bnd) = do
  ((x, unembed -> a), b) <- unbind bnd
  wfType a
  local (over envCtx (M.insert x a)) $ wfType b

For atomic type families we infer their kinds. For atoms we read the kind off from the environment. For applications we infer the kind of the type family (which had better be a pi kind), and then check that the term argument has the expected type and then return the resulting kind where we (heredeterily) substitute the term for the index variable.

inferP :: (Fresh m, MonadReader Env m, MonadError String m) => P -> m Kind
inferP (AtmP atm) = lookupAtom atm
inferP (AppP p m) = do
  k <- inferP p
  case k of
    TypeK -> throwError "expected a pi kind"
    PiK bnd -> do
      ((x, unembed -> a), k') <- unbind bnd
      checkTerm m a
      substKind m x k'

To check that a term in normal form has the expected normal type, we check that its either a lambda of pi type, or an atomic term of atomic type. The latter ensures that terms are in eta long form by requiring all variables and constants to be fully applied.

We infer the type of an atomic term (which had better be atomic) and then check that it is alpha-equivalent to the given atomic type. Because the calculus is constructed to only allow terms in normal form, alpha equivalence suffices and we don't have to do any normalization. (We paid that price in heredetary substitution.)

checkTerm :: (Fresh m, MonadReader Env m, MonadError String m) => Term -> Type -> m ()
checkTerm (LamM bnd) (PiT bnd') = do
  mmatch <- unbind2 bnd bnd'
  case mmatch of
    Just (x, m, (_, unembed -> a), b) -> do
      wfType a
      local (over envCtx (M.insert x a)) $ checkTerm m b
    Nothing -> throwError "did not match"
checkTerm (RM r) (PT p) = do
  t <- inferTerm r
  case t of
    (PT p') | p `aeq` p' -> return ()
    _ -> throwError "atomic term doesn't have the expected atomic type."
checkTerm (LamM {}) _ = throwError "lambda with no-PI type"
checkTerm (RM {}) _ = throwError "atomic term with non-atomic type"

To infer the type of a term, we lookup variables and constants in the environment. For applications we ensure that the head has some kind of pi type and then check the index against the argument type, and then return the result type with the argument substituted for the index variable.

inferTerm :: (Fresh m, MonadReader Env m, MonadError String m) => R -> m Type
inferTerm (VarR x) = lookupVar x
inferTerm (ConstR c) = lookupConst c
inferTerm (AppR r m) = do
  p_ <- inferTerm r
  case p_ of
    PiT bnd -> do
      ((x, unembed -> a), b) <- unbind bnd
      checkTerm m a
      substType m x b
    PT {} -> throwError "expected a function in application position"

Smart Constructors

This section defines a little DSL for writing Canonical LF terms in Haskell. It's a higher-order encoding that uses haskell variable binding to represent LF binding constructs.

We use a higher-kinded repr parameter to allow for different sorts of interpretations for this DSL. Although in this example we only build one using the Syn newtype to wrap a fresh-name monad computation that just builds a term in our original AST, above.

class TermSyntax repr r n | r -> n, n -> r where
  lam :: String -> ((repr r) -> (repr n)) -> (repr n)
  app :: repr r -> repr n -> repr r
  rm :: repr r -> repr n

newtype Syn m a = Syn { unSyn :: m a } 

instance Fresh m => TermSyntax (Syn m) R Term where
  lam hint f = Syn $ do
    x <- fresh (s2n hint)
    m <- unSyn $ f (Syn $ return $ VarR x)
    return $ LamM (bind x m)
  app r n = Syn (AppR <$> unSyn r <*> unSyn n)
  rm r = Syn (RM <$> unSyn r)

class  TypeSyntax repr p a | a -> p, p -> a where
  type TermInType repr a :: *
  type RInType repr a :: *
  piT :: String -> repr a -> (repr (RInType repr a) -> repr a) -> repr a
  arrT :: repr a -> repr a -> repr a
  arrT a b = piT "_" a (const b)
  appP :: repr p -> repr (TermInType repr a) -> repr p
  pt :: repr p -> repr a

instance Fresh m => TypeSyntax (Syn m) P Type where
  type TermInType (Syn m) Type = Term
  type RInType (Syn m) Type = R
  piT hint sa f = Syn $ do
    x <- fresh (s2n hint)
    b <- unSyn $ f (Syn $ return $ VarR x)
    a <- unSyn sa
    return $ PiT $ bind (x, embed a) b
  appP p m = Syn (AppP <$> unSyn p <*> unSyn m)
  pt p = Syn (PT <$> unSyn p)

class KindSyntax repr k where
  type TypeInKind repr k :: *
  type RInKind repr k :: *
  typeK :: repr k
  piK :: String -> repr (TypeInKind repr k) -> (repr (RInKind repr k) -> repr k) -> repr k
  arrK :: repr (TypeInKind repr k) -> repr k -> repr k
  arrK a k = piK "_" a (const k)

instance Fresh m => KindSyntax (Syn m) Kind where
  type TypeInKind (Syn m) Kind = Type
  type RInKind (Syn m) Kind = R
  typeK = Syn $ return TypeK
  piK hint sa sk = Syn $ do
    x <- fresh (s2n hint)
    a <- unSyn sa
    k <- unSyn $ sk (Syn $ return $ VarR x)
    return $ PiK $ bind (x, embed a) k

class SignatureSyntax repr sig p r | sig -> p r where
  type KindInSig repr sig :: *
  type TypeInSig repr sig :: *
  letAtom :: String -> repr (KindInSig repr sig) -> (repr p -> repr sig) -> repr sig
  letConstant :: String -> repr (TypeInSig repr sig) -> (repr r -> repr sig) -> repr sig
  endSig :: repr sig

instance Fresh m => SignatureSyntax (Syn m) (Signature -> Signature) P R where
  type KindInSig (Syn m) (Signature -> Signature) = Kind
  type TypeInSig (Syn m) (Signature -> Signature) = Type

  letAtom hint sk kont = Syn $ do
    a <- fresh (s2n hint)
    k <- unSyn sk
    f <- unSyn $ kont $ Syn $ return $ AtmP a
    return $ \sig -> f (SnocAtom $ rebind sig (a, embed k))

  letConstant hinst st kont = Syn $ do
    c <- fresh (s2n hinst)
    t <- unSyn st
    f <- unSyn $ kont $ Syn $ return $ ConstR c
    return $ \sig -> f (SnocConst $ rebind sig (c, embed t))

  endSig = Syn $ return id

infixr 6 `arrT`, `arrK`
infixl 6 `appP`, `app`

Example

An LF signature fragment for first-order logic.

example1 :: Fresh m => Syn m (Signature -> Signature)
example1 =
  letAtom "o" typeK $ \o ->
  letConstant "tt" (pt o) $ \tt ->
  letConstant "ff" (pt o) $ \ff ->
  letConstant "not" (pt o `arrT` pt o) $ \not ->
  letConstant "and" (pt o `arrT` pt o `arrT` pt o) $ \and ->
  letAtom "nd" (pt o `arrK` typeK) $ \nd ->
  letConstant "tti" (pt (nd `appP` rm tt)) $ \tti ->
  letConstant "ffe" (piT "a" (pt o) $ \a ->
                       pt (nd `appP` rm ff)
                       `arrT`
                       pt (nd `appP` rm a)) $ \ffe ->
  letConstant "noti" (piT "a" (pt o) $ \a ->
                         (piT "p" (pt o) $ \p ->
                             pt (nd `appP` rm a)
                             `arrT`
                             pt (nd `appP` rm p))
                         `arrT`
                         pt (nd `appP` rm (not `app` rm a))) $ \noti ->
  letConstant "note" (piT "a" (pt o) $ \a -> piT "c" (pt o) $ \c ->
                         pt (nd `appP` rm (not `app` rm a))
                         `arrT`
                         pt (nd `appP` rm a)
                         `arrT`
                         pt (nd `appP` rm c)) $ \note ->
  letConstant "andi" (piT "a" (pt o) $ \a -> piT "b" (pt o) $ \b ->
                         pt (nd `appP` rm a)
                         `arrT`
                         pt (nd `appP` rm b)
                         `arrT`
                         pt (nd `appP` rm (and `app` rm a `app` rm b))) $ \andi ->
  letConstant "ande1" (piT "a" (pt o) $ \a -> piT "b" (pt o) $ \b ->
                          pt (nd `appP` rm (and `app` rm a `app` rm b))
                          `arrT`
                          pt (nd `appP` rm a)) $ \ande1 ->
  letConstant "ande2" (piT "a" (pt o) $ \a -> piT "b" (pt o) $ \b ->
                          pt (nd `appP` rm (and `app` rm a `app` rm b))
                          `arrT`
                          pt (nd `appP` rm b)) $ \ande2 ->
  endSig

checkSynSig :: (Fresh m, MonadReader Env m, MonadError String m) => Syn m (Signature -> Signature) -> m ()
checkSynSig sig = do
  sigf <- unSyn sig
  withSigOk (sigf NilS) $ return ()

Example:

  >>> runExceptT $ runFreshMT $ runReaderT (checkSynSig example1) emptyEnv
  Right ()

Canonical LF using unbound-generics

Aleksey Kliger

June 2016