SwiftSyntax is a Swift library that lets you parse, analyze, generate, and transform Swift source code. It’s based on the libSyntax library, and was spun out from the main Swift language repository in August 2017.
Together, the goal of these projects is to provide safe, correct, and intuitive facilities for structured editing, which is described thusly:
What is structured editing? It’s an editing strategy that is keenly aware of the structure of source code, not necessarily its representation (i.e. characters or bytes). This can be achieved at different granularities: replacing an identifier, changing a call to global function to a method call, or indenting and formatting an entire source file based on declarative rules.
At the time of writing, SwiftSyntax is still in development and subject to API changes. But you can start using it today to work with Swift source code in a programmatic way.
It’s currently used by the Swift Migrator, and there are ongoing efforts to adopt the tool, both internally and externally.
How Does It Work?
To understand how SwiftSyntax works, let’s take a step back and look at the Swift compiler architecture:
The Swift compiler is primarily responsible for turning Swift code into executable machine code. The process is divided up into several discrete steps, starting with the parser, which generates an abstract syntax tree (AST). From there, semantic analysis is performed on the syntax to produce a type-checked AST, which is lowered into Swift Intermediate Language (SIL); the SIL is transformed and optimized and itself lowered into LLVM IR, which is ultimately compiled into machine code.
The most important takeaway for our discussion is that SwiftSyntax operates on the AST generated at the first step of the compilation process. As such, it can’t tell you any semantic or type information about code.
Contrast this with something like SourceKit, which operates with a much more complete understanding of Swift code. This additional information can be helpful for implementing editor features like code-completion or navigating across files. But there are plenty of important use cases that can be satisfied on a purely syntactic level, such as code formatting and syntax highlighting.
Demystifying the AST
Abstract syntax trees can be difficult to understand in the abstract. So let’s generate one and see what it looks like.
Consider the following single-line Swift file, which declares a function named one() that returns the value 1:
func one() -> Int { return 1 }
Run the swiftc command on this file, passing the -frontend -emit-syntax arguments:
$ xcrun swiftc -frontend -emit-syntax ./One.swift
The result is a chunk of JSON representing the AST. Its structure becomes much clearer once you reformat the JSON:
{
  "kind": "SourceFile",
  "layout": [{
    "kind": "CodeBlockItemList",
    "layout": [{
      "kind": "CodeBlockItem",
      "layout": [{
        "kind": "FunctionDecl",
        "layout": [null, null, {
          "tokenKind": {
            "kind": "kw_func"
          },
          "leadingTrivia": [],
          "trailingTrivia": [{
            "kind": "Space",
            "value": 1
          }],
          "presence": "Present"
        }, {
          "tokenKind": {
            "kind": "identifier",
            "text": "one"
          },
          "leadingTrivia": [],
          "trailingTrivia": [],
          "presence": "Present"
        }, ...
The Python json.tool module offers a convenient way to format JSON. It comes standard in macOS releases going back as far as anyone can recall. For example, here’s how you could use it with the redirected compiler output:
$ xcrun swiftc -frontend -emit-syntax ./One.swift | python -m json.tool
At the top level, we have a SourceFile consisting of CodeBlockItemList elements and their constituent CodeBlockItem parts. This example has a single CodeBlockItem for the function declaration (FunctionDecl), which itself comprises subcomponents including a function signature, parameter clause, and return clause.
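If you want to explore that nesting without reading raw JSON, one option is to walk the tree and print just the kinds. Here's a minimal sketch that assumes only the shape visible in the output above: nodes carry "kind" and "layout" keys, tokens carry a "tokenKind" object, and absent children appear as null. The outline function and the trimmed-down sample JSON are made up for this example.

```swift
import Foundation

// Hypothetical helper: returns one indented line per node or token in the JSON tree.
func outline(_ node: Any, depth: Int = 0) -> [String] {
    // Absent children are encoded as null, which fails this cast and is skipped.
    guard let object = node as? [String: Any] else { return [] }
    let indent = String(repeating: "  ", count: depth)

    // Tokens are leaves: report their kind and, when present, their text.
    if let tokenKind = object["tokenKind"] as? [String: Any],
       let kind = tokenKind["kind"] as? String {
        let text = (tokenKind["text"] as? String).map { " \"\($0)\"" } ?? ""
        return ["\(indent)token \(kind)\(text)"]
    }

    // Everything else is a node whose children live in "layout".
    guard let kind = object["kind"] as? String else { return [] }
    let children = object["layout"] as? [Any] ?? []
    return ["\(indent)\(kind)"] + children.flatMap { outline($0, depth: depth + 1) }
}

// A trimmed-down sample in the same shape as the compiler output.
let json = """
{"kind": "FunctionDecl", "layout": [null, null,
  {"tokenKind": {"kind": "kw_func"}, "leadingTrivia": [], "trailingTrivia": [], "presence": "Present"},
  {"tokenKind": {"kind": "identifier", "text": "one"}, "leadingTrivia": [], "trailingTrivia": [], "presence": "Present"}]}
"""

let tree = try JSONSerialization.jsonObject(with: Data(json.utf8))
let lines = outline(tree)
print(lines.joined(separator: "\n"))
```

To try it on real output, you could save the result of the swiftc command to a file and load that instead of the inline sample.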
The term trivia is used to describe anything that isn’t syntactically meaningful, like whitespace. Each token can have zero or more pieces of leading and trailing trivia. For example, the space after the Int in the return clause (-> Int) is represented by the following piece of trailing trivia:
{
  "kind": "Space",
  "value": 1
}
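To make the idea of trivia more concrete, here's a minimal sketch of how a token might carry its surrounding whitespace. The TriviaPiece and Token types below are hypothetical stand-ins written for illustration; they are not SwiftSyntax's own types. What it tries to show is that, because every token keeps its trivia, joining the tokens back together reproduces the original source text exactly.

```swift
// Hypothetical stand-ins for illustration; not SwiftSyntax's actual types.
enum TriviaPiece {
    case spaces(Int)
    case newlines(Int)

    // The literal characters this piece of trivia stands for.
    var text: String {
        switch self {
        case .spaces(let count): return String(repeating: " ", count: count)
        case .newlines(let count): return String(repeating: "\n", count: count)
        }
    }
}

struct Token {
    var text: String
    var leadingTrivia: [TriviaPiece]
    var trailingTrivia: [TriviaPiece]

    init(_ text: String, leading: [TriviaPiece] = [], trailing: [TriviaPiece] = []) {
        self.text = text
        self.leadingTrivia = leading
        self.trailingTrivia = trailing
    }

    // The token's text together with the trivia attached to it.
    var description: String {
        let leading = leadingTrivia.map { $0.text }.joined()
        let trailing = trailingTrivia.map { $0.text }.joined()
        return leading + text + trailing
    }
}

// Tokens for `func one() -> Int { return 1 }`, with whitespace kept as trailing trivia.
let tokens = [
    Token("func", trailing: [.spaces(1)]),
    Token("one"),
    Token("("),
    Token(")", trailing: [.spaces(1)]),
    Token("->", trailing: [.spaces(1)]),
    Token("Int", trailing: [.spaces(1)]),
    Token("{", trailing: [.spaces(1)]),
    Token("return", trailing: [.spaces(1)]),
    Token("1", trailing: [.spaces(1)]),
    Token("}"),
]

let source = tokens.map { $0.description }.joined()
print(source)  // prints "func one() -> Int { return 1 }"
```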
Working Around File System Constraints
SwiftSyntax generates abstract syntax trees by delegating system calls to swiftc. However, this requires code to be associated with a file in order to be processed, and it’s often useful to work with code as a string.
One way to work around this constraint is to write code to a temporary file and pass that to the compiler.
We’ve written about temporary files in the past, but nowadays, there’s a much nicer API for working with them that’s provided by the Swift Package Manager itself. In your Package.swift file, add the following package dependency, and add the "Utility" dependency to the appropriate target:
.package(url: "https://github.com/apple/swift-package-manager.git", from: "0.3.0"),
Now, you can import the Basic module and use its TemporaryFile API like so:
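import Basic
import Foundation
import SwiftSyntax

// Example input; substitute whatever source string you want to parse.
let code = "func one() -> Int { return 1 }"

// Write the code to a temporary file that's deleted once its handle is closed.
let tempfile = try TemporaryFile(deleteOnClose: true)
defer { tempfile.fileHandle.closeFile() }
tempfile.fileHandle.write(code.data(using: .utf8)!)

// Parse the temporary file into a syntax tree.
let url = URL(fileURLWithPath: tempfile.path.asString)
let sourceFile = try SyntaxTreeParser.parse(url)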
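If you'd rather not depend on the Swift Package Manager's libraries, roughly the same workaround can be sketched with Foundation alone. The helper name withTemporarySwiftFile is made up for this example, and the closure is where a call like SyntaxTreeParser.parse(url) would go. Here it just reads the file back so the snippet runs on its own.

```swift
import Foundation

// Hypothetical helper: writes `code` to a uniquely named temporary file,
// hands the file's URL to `body`, and removes the file afterwards.
func withTemporarySwiftFile<T>(containing code: String,
                               _ body: (URL) throws -> T) throws -> T {
    let url = FileManager.default.temporaryDirectory
        .appendingPathComponent(UUID().uuidString)
        .appendingPathExtension("swift")
    try code.write(to: url, atomically: true, encoding: .utf8)
    // Clean up whether or not `body` throws.
    defer { try? FileManager.default.removeItem(at: url) }
    return try body(url)
}

// Usage: read the file back to confirm the round trip.
let code = "func one() -> Int { return 1 }"
let roundTripped = try withTemporarySwiftFile(containing: code) { url in
    try String(contentsOf: url, encoding: .utf8)
}
print(roundTripped == code)  // prints "true"
```

Scoping the file's lifetime to a closure is one design choice among several; it trades a little flexibility for not having to remember to delete the file.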
What Can You Do With It?
Now that we have a reasonable idea of how SwiftSyntax works, let’s talk about some of the ways that you can use it!
Writing Swift Code: The Hard Way
The first and least compelling use case for SwiftSyntax is to make writing Swift code an order of magnitude more difficult.
SwiftSyntax, by way of its SyntaxFactory APIs, allows you to generate entirely new Swift code from scratch. Unfortunately, doing this programmatically isn’t exactly a walk in the park.
For example, consider the following code:
import SwiftSyntax

// Build each token individually, including the whitespace around it.
let structKeyword = SyntaxFactory.makeStructKeyword(trailingTrivia: .spaces(1))
let identifier = SyntaxFactory.makeIdentifier("Example", trailingTrivia: .spaces(1))
let leftBrace = SyntaxFactory.makeLeftBraceToken()
let rightBrace = SyntaxFactory.makeRightBraceToken(leadingTrivia: .newlines(1))

// Assemble the (empty) member block from its braces.
let members = MemberDeclBlockSyntax { builder in
    builder.useLeftBrace(leftBrace)
    builder.useRightBrace(rightBrace)
}

// Assemble the struct declaration from the pieces above.
let structureDeclaration = StructDeclSyntax { builder in
    builder.useStructKeyword(structKeyword)
    builder.useIdentifier(identifier)
    builder.useMembers(members)
}

print(structureDeclaration)
Whew. So what did all of that effort get us?
struct Example {
}
Oofa doofa.
This certainly isn’t going to replace GYB for everyday code generation purposes. (In fact, libSyntax and SwiftSyntax both make extensive use of gyb to generate their interfaces.)
But this interface can be quite useful when precision matters. For instance, you might use SwiftSyntax to implement a fuzzer for the Swift compiler, using it to randomly generate arbitrarily-complex-but-ostensibly-valid programs to stress test its internals.
Rewriting Swift Code
The example provided in the SwiftSyntax README shows how to write a program to take each integer literal in a source file and increment its value by one.
Looking at that, you can already extrapolate out to how this might be used to create a canonical swift-format tool. But for the moment, let’s consider a considerably less productive, and more seasonally appropriate (🎃), use of source rewriting:
import SwiftSyntax

public class ZalgoRewriter: SyntaxRewriter {
    public override func visit(_ token: TokenSyntax) -> Syntax {
        // Leave every token that isn't a string literal untouched.
        guard case let .stringLiteral(text) = token.tokenKind else {
            return token
        }

        return token.withKind(.stringLiteral(zalgo(text)))
    }
}
What’s that zalgo function all about? You’re probably better off not knowing…
Anyway, running this rewriter on your source code transforms all string literals in the following manner:
// Before 👋😄
print("Hello, world!")
// After 🦑😵
print("H͞͏̟̂ͩel̵ͬ͆͜ĺ͎̪̣͠ơ̡̼͓̋͝, w͎̽̇ͪ͢ǒ̩͔̲̕͝r̷̡̠͓̉͂l̘̳̆ͯ̊d!")
Spooky, right?
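For the morbidly curious, here's one way such a function could be written. This is a guess at the general technique rather than the implementation behind the example above: it appends a few randomly chosen code points from the Unicode Combining Diacritical Marks block (U+0300 through U+036F) to each letter. The intensity parameter is invented for this sketch.

```swift
// A hypothetical zalgo(_:) for illustration; not necessarily what the rewriter above calls.
func zalgo(_ text: String, intensity: Int = 5) -> String {
    // Combining Diacritical Marks block: U+0300 through U+036F.
    let codePoints: ClosedRange<UInt32> = 0x0300...0x036F
    let marks: [Unicode.Scalar] = codePoints.compactMap { Unicode.Scalar($0) }

    var result = String.UnicodeScalarView()
    for scalar in text.unicodeScalars {
        result.append(scalar)

        // Leave whitespace and punctuation alone so the text stays (barely) legible.
        guard scalar.properties.isAlphabetic else { continue }

        // Stack between 1 and `intensity` random marks on top of the letter.
        for _ in 0..<Int.random(in: 1...max(1, intensity)) {
            result.append(marks.randomElement()!)
        }
    }
    return String(result)
}

let spooky = zalgo("Hello, world!")
print(spooky)
```

Because the marks are chosen at random, the output differs on every run; stripping the combining marks gives back the original string.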
Highlighting Swift Code
Let’s conclude our look at SwiftSyntax with something that’s actually useful: a Swift syntax highlighter.
A syntax highlighter, in this sense, describes any tool that takes source code and formats it in a way that’s more suitable for display in HTML.
NSHipster is built on top of Jekyll, and uses the Ruby library Rouge to colorize the example code you see in every article. However, due to Swift’s relatively complex syntax and rapid evolution, the generated HTML isn’t always 100% correct.
Instead of messing with a pile of regular expressions, we could instead build a syntax highlighter that leverages SwiftSyntax’s superior understanding of the language.
At its core, the implementation is rather straightforward: implement a subclass of SyntaxRewriter and override the visit(_:) method that’s called for each token as a source file is traversed. By switching over each of the different kinds of tokens, you can map them to the HTML markup for their corresponding highlighter tokens.
For example, numeric literals are represented with <span> elements whose class name begins with the letter m (mf for floating-point, mi for integer, etc.). Here’s the corresponding code in our SyntaxRewriter subclass:
import SwiftSyntax

class SwiftSyntaxHighlighter: SyntaxRewriter {
    var html: String = ""

    override func visit(_ token: TokenSyntax) -> Syntax {
        switch token.tokenKind {
        …
        case .floatingLiteral(let string):
            html += "<span class=\"mf\">\(string)</span>"
        case .integerLiteral(let string):
            // Pick the class name from the literal's radix prefix.
            if string.hasPrefix("0b") {
                html += "<span class=\"mb\">\(string)</span>"
            } else if string.hasPrefix("0o") {
                html += "<span class=\"mo\">\(string)</span>"
            } else if string.hasPrefix("0x") {
                html += "<span class=\"mh\">\(string)</span>"
            } else {
                html += "<span class=\"mi\">\(string)</span>"
            }
        …
        default:
            break
        }

        return token
    }
}
Although SyntaxRewriter has specialized visit(_:) methods for each of the different kinds of syntax elements, I found it easier to handle everything in a single switch statement. (Printing unhandled tokens in the default branch was a really helpful way to find any cases that I wasn’t already handling.) It’s not the most elegant of implementations, but it was a convenient place to start given my limited understanding of the library.
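One detail worth keeping in mind with this approach is that token text generally needs to be HTML-escaped before it's interpolated into markup; otherwise a < or & in the source would corrupt the output. Here's a minimal sketch of how the integer-literal branch could be factored into plain functions that are easy to test on their own. The names htmlEscaped, cssClass(forIntegerLiteral:), and span(className:text:) are invented for this example, and the class names mirror the ones used above.

```swift
// Hypothetical helpers for illustration; none of these names come from SwiftSyntax.

// Escapes the characters that are significant in HTML text and attribute values.
func htmlEscaped(_ text: String) -> String {
    var escaped = ""
    for character in text {
        switch character {
        case "&": escaped += "&amp;"
        case "<": escaped += "&lt;"
        case ">": escaped += "&gt;"
        case "\"": escaped += "&quot;"
        default: escaped.append(character)
        }
    }
    return escaped
}

// Maps an integer literal's radix prefix to a highlighter class name.
func cssClass(forIntegerLiteral literal: String) -> String {
    if literal.hasPrefix("0b") { return "mb" }  // binary
    if literal.hasPrefix("0o") { return "mo" }  // octal
    if literal.hasPrefix("0x") { return "mh" }  // hexadecimal
    return "mi"                                 // decimal
}

// Wraps escaped text in a <span> with the given class name.
func span(className: String, text: String) -> String {
    return "<span class=\"\(className)\">\(htmlEscaped(text))</span>"
}

print(span(className: cssClass(forIntegerLiteral: "0xFF"), text: "0xFF"))
// prints <span class="mh">0xFF</span>
print(htmlEscaped("a < b && c"))
// prints a &lt; b &amp;&amp; c
```

With helpers like these, each case in the switch statement shrinks to a single call, and the mapping can be unit-tested without parsing any source files.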
Anyway, after a few hours of development, I was able to generate reasonable colorized output for a wide range of Swift syntactic features.
The project comes with a library and a command line tool. Go ahead and try it out and let me know what you think!