Nifty - Part 4: Type Parsing

In Part 3 we finished off most of the syntax analysis that we’ll need to create a basic LLVM front-end for Swift. This week, we’ll take a brief look at type parsing, so that we can represent richer, compound types like those that represent functions and tuples. Some features, like tuples, won’t have full code generation support as we continue on, but parsing them is a simple exercise. As usual, feature parity with the Swift compiler isn’t a goal, so we’ll just focus on the more interesting examples.

Subclassing SwiftType

So far, the only type we could represent was a named type (a SwiftTypeIdentifier). This is a scalar type, representing the type of single value or reference, or Void. We need two more classes of types. The first is a function type. For example:

let foo: Int -> String

Representing function types allows us to be able to store and pass closures around. Next, we’ll need a tuple type:

let bar: (Int, String)

We might even need to combine the two at times:

let baz: (Int, String) -> String

In fact, we can combine these types in any number of ways:

let qux: (((Int, String) -> String), String)

Although you might read the type of baz to be a function which takes an Int and a String, it’s equivalent to a function which takes a tuple of (Int, String). The following is perfectly valid Swift:

let oneAndTwo = (1, 2)
func add (a: Int, b: Int) -> Int {
    return a + b
}
add(oneAndTwo)

First, we’ll add a new subclass for function types:

// A function type.
public class SwiftFunctionType: SwiftType {
    var parameterType: SwiftType
    var returnType: SwiftType
    
    /**
    Initialises a SwiftFunctionType node.
    
    :param: parameterType The parameter type.
    :param: returnType The returnType type.
    :param: lineContext An LineContext relating to the node.
    
    :returns: An initialised SwiftFunctionType.
    **/
    init(parameterType: SwiftType, returnType: SwiftType, lineContext: LineContext?) {
        self.parameterType = parameterType
        self.returnType = returnType
        super.init(lineContext: lineContext)
    }
    ...
}

Similarly, we’ll add an AST node for a tuple types.

// A parameter list type, used for multiple return values and parameter lists in function types.
public class SwiftTupleType: SwiftType {
    var types: [SwiftType]
    
    /**
    Initialises a SwiftTupleType node.
    
    :param: types A list of SwiftType objects.
    :param: lineContext An optional LineContext relating to the node.
    
    :returns: An initialised SwiftTupleType.
    **/
    init (types: [SwiftType], lineContext: LineContext?) {
        self.types = types
        super.init(lineContext: lineContext)
    }
    ...
}

Parsing function types

First up, function types. We start off at the parseType() function.

/**
Parses a type.

:returns: A SwiftType if no errors occured, nil otherwise.
**/
func parseType() -> SwiftType? {
  switch tokens[0] {
  case .Identifier(let t):
      let context = self.lineContext[0]
      consumeToken()
      return parseTypeRHS(lhs: SwiftTypeIdentifier(identifier: t, lineContext: context), isBracketed: false)
  default:
      return nil
  }
}

Note that now we delegate to parseTypeRHS() to handle parsing the remainder of a function type.

/**
Parses the right hand side of a type.

:param: lhs The left hand side of the type.
:param: isBracketed Causes the type to be parsed differently if the left hand side includes a bracket.

:returns: A SwiftType if no errors occured, nil otherwise.
**/
func parseTypeRHS(#lhs: SwiftType) -> SwiftType? {
  if !tokens.isEmpty {
      switch tokens[0] {
      case .Returns:
          if let type = parseRHSFunctionType(parameterType: lhs) {
              return parseTypeRHS(lhs: type)
          } else {
              return nil
          }
      default:
          return lhs
      }
  }
  return lhs
}

parseTypeRHS() is a recursive function, similar to those encountered in the binary precedence parsing section of Part 3. It delegates to parseFunctionTypeRHS() to parse the remainder of a function if the arrow (->) symbol is encountered.

/**
Parses the right hand side of a SwiftFunctionType.

:param: parameterType The parameter type of the function.
:param: isBracketed Causes the type to be parsed differently if the left hand side includes a bracket.

:returns: A SwiftFunctionType if no errors occured, nil otherwise.
**/
func parseFunctionTypeRHS(#parameterType: SwiftType) -> SwiftType? {
  var type: SwiftFunctionType!
  while true {
      if tokens.isEmpty {
          return type
      }
      switch tokens[0] {
      case .Returns:
          consumeToken()
      case .RightBracket:
          return type
      default:
          if let t = parseType() {
              type = SwiftFunctionType(parameterType: parameterType, returnType: t)
          } else {
              return nil
          }
      }
  }
}

From The Swift Programming Language, “The function types of a curried function are grouped from right to left.”. A curried function is essentially a function which takes an argument and returns another function. So the given type below can be thought of as taking an Int, and returning a function which takes a String and returns a Bool.

let foo: Int -> String -> Bool

That is to say, the following statement is also equivalent:

let foo: Int -> (String -> Bool)

Because the -> operator is right-associative, parseRHSFunctionType() delegates back to parseType() to gobble up the remaining types in a way that conforms to this behaviour.

Parsing tuple types

Parsing tuple types from here is easy, we just need to parse comma-separated values within brackets, representing these types with SwiftTupleType objects. First, some changes to parseType().

func parseType() -> SwiftType? {
  switch tokens[0] {
  case .Identifier(let t):
      let context = self.lineContext[0]
      consumeToken()
      return parseTypeRHS(lhs: SwiftTypeIdentifier(identifier: t, lineContext: context), isBracketed: false)
  case .Void:
      let context = self.lineContext[0]
      consumeToken()
      // Void is equivalent to the empty tuple ().
      return parseTypeRHS(lhs: SwiftTupleType(types: [], lineContext: context), isBracketed: false)
  case .LeftBracket:
      let typeContext = self.lineContext[0]
      consumeToken()
      let t = parseType() ?? SwiftTupleType(types: [], lineContext: typeContext)
      if let type = parseTypeRHS(lhs: t, isBracketed: true) {
          return parseTypeRHS(lhs: type)
      } else {
          return nil
      }
  default:
      return nil
  }
}

When encountering a left bracket, we parse the type. If it was nil, the type is () or Void. Otherwise, we continue to parse the RHS, this time passing a boolean value representing the fact that this type is bracketed and thus is a tuple type. The other case simply checks for a Void token, which is a equivalent to an empty tuple. Now, we just need a few changes to parseTypeRHS().

func parseTypeRHS(#lhs: SwiftType, isBracketed bracketed: Bool = false) -> SwiftType? {
  if !tokens.isEmpty {
      switch tokens[0] {
      // Close a tuple.
      case .RightBracket where bracketed:
          consumeToken()
          return lhs
      // Parse the list of types in a TupleType.
      case .Comma where bracketed:
          if let type = parseRHSTupleType(lhs) {
              return parseTypeRHS(lhs: type, isBracketed: bracketed)
          } else {
              return nil
          }
      case .Returns:
          if let type = parseRHSFunctionType(parameterType: lhs, isBracketed: bracketed) {
              return parseTypeRHS(lhs: type)
          } else {
              return nil
          }
      default:
          if bracketed {
              errors.append(SCError(message: "Missing expected ')'.", lineContext: self.lineContext[0]))
              return nil
          } else {
              return lhs
          }
      }
  }
  return lhs
}

We simply check for commas if the expression is bracketed, and delegate to parseRHSTupleType(), with error checking sprinkled in for good measure. parseRHSTupleType() is probably going to look pretty familiar by now, as it simply parses a list of types. We’ve handled parsing lists when we looked at function declarations, and this is no different.

/**
Parses the right hand side of a SwiftTupleType.

:param: lhs The left hand side of the SwiftTupleType.

:returns: A SwiftTupleType if no errors occured, nil otherwise.
**/
func parseRHSTupleType(lhs: SwiftType) -> SwiftTupleType? {
  var types: [SwiftType] = [lhs]
  let typeContext = self.lineContext[0]
  while true {
      switch tokens[0] {
      case .Comma:
          consumeToken()
      case .RightBracket:
          return SwiftTupleType(types: types, lineContext: typeContext)
      default:
          if let type = parseType() {
              types.append(type)
          } else {
              return nil
          }
      }
  }
}

Wrapping things up

With these changes in place, we can handle types with arbitrary complexity. For example, given:

Int -> (String, Int -> Bool) -> Double

Our AST (with pretty printing) spits out the following:

(Int -> ((String, (Int -> Bool)) -> Double))

There are plenty of other examples available in TestTypeParsing.swift.

We’ve spent a lot of time building up the AST generator. Next time we’ll take a look at the semantic analysis of a program, which includes basic type inference and checking.