Ever since Swift adopted the “implement first, propose second” rule, some contributors to the Swift Evolution process have felt sidelined and dropped out of the community. It’s frustrating having ideas for positive language changes and not being able to implement those changes.
Despite expertise in Swift and Objective-C (not to mention any number of other languages), they, like me, may not be proficient in C++ and Python, the core tools of Swift language implementation. To date, my code contributions to Swift have been extremely low level.
I think I fixed a comment, added a string, and maybe made one or two other tiny tweaks. (I did work on the Changelog file a while back, but that is written in CommonMark and does not involve programming in the slightest.)
I’ve wanted to be able to build what I can dream. And I’ve slowly been diving into the compiler in recent months to see what it takes to build something new. With quite a lot of hand holding from John Holdsworth, I implemented a couple of build directives to test whether (1) asserts can fire and (2) a build is optimized.
What is Debug?
Answering “what does ‘debug’ mean?” was harder than I initially thought. Coming as I do from Xcode, where ‘debug’ means a scheme I select from within the IDE, I needed some time and advice to approach ‘debug’ from a platform-independent viewpoint. After some back and forth on the Swift Evolution email list and later the forums, the consensus centered on the two tests I mentioned above: assertions and optimization.
For many projects, a typical debug build is unoptimized, with asserts that can fire. As projects move into the beta process, that mindset changes. Many in-house and beta builds meant for wider use need optimization.
The state of the art uses the custom conditional compilation flags set with -D. This approach decouples the meaning of ‘debug’ (or, for most developers, 'DEBUG') from anything in-language that can be decoupled from build settings and persist through source code. Assert configurations have their own flag, -assert-config <#value#>.
Introducing these two tests lets you align your not-for-release code to assert states and/or optimization states:
    #if !configuration(optimized)
    // code for un-optimized builds only
    #endif

    #if configuration(assertsWillFire)
    // code where assertions can fire
    #endif
The proof-of-concept implementation coupled with a proposal means I may be able to submit a more substantial and meaningful contribution to the language.
Going Solo
Pushing forward, I wanted to test myself and check whether I could make changes on my own, even if that solo journey was quite small. I started by attempting to build the #exit directive that was being discussed on the forums. This turned out to be a little more complicated than I was ready for.
Among other challenges, #exit may or may not be embedded in an #if build configuration test. Using #exit should allow all material up until that point to be compiled into the build while gracefully excluding all material after. I didn’t know how to check whether the directive was embedded in a condition and how to properly complete the condition (#endif) while discarding remaining text. It was, at my stage of the journey, a step too far.
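To make the intended behavior concrete, here is a toy, text-level sketch in C++. It is not how the Swift parser works and not something from my attempt: the function names are hypothetical, it treats the source as a list of lines, and it assumes every enclosing #if block is active when #exit is reached.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Returns true when `line`, ignoring leading spaces and tabs, starts with
// `directive`. Hypothetical helper for this sketch only.
static bool startsWithDirective(const std::string &line,
                                const std::string &directive) {
  std::size_t start = line.find_first_not_of(" \t");
  if (start == std::string::npos)
    return false;
  return line.compare(start, directive.size(), directive) == 0;
}

// Keeps every line before the first #exit, then closes any #if blocks that
// were still open at that point, so the kept text stays balanced.
std::vector<std::string> truncateAtExit(const std::vector<std::string> &lines) {
  std::vector<std::string> kept;
  int openConditions = 0;
  for (const std::string &line : lines) {
    if (startsWithDirective(line, "#exit")) {
      for (int i = 0; i < openConditions; ++i)
        kept.push_back("#endif");
      return kept; // everything after #exit is discarded
    }
    if (startsWithDirective(line, "#if"))
      ++openConditions;
    else if (startsWithDirective(line, "#endif") && openConditions > 0)
      --openConditions;
    kept.push_back(line);
  }
  return kept; // no #exit found: the input is returned unchanged
}
```

Even this simplified version has to count open conditions to know how many #endif lines to emit. Doing the equivalent inside a real parser, where conditions may be inactive, is the part I could not work out.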
I put my first attempt to the side and tried something else. I tried to push a scope using with(value) { }, so the material within the scope was native to value. That too proved too difficult without assistance, although I am beginning to understand how Swift creates and manages its scope. It was a programming failure but a learning success.
Two projects abandoned, I knew I had to pick something very easy to work with. Although I would have loved to have picked up and run with Dave DeLong’s context pitch (which is discussed here), I recognized that I needed to bite off something smaller first. So I decided to add a #dogcow token that produces the string value `"🐶🐮"` in source. How difficult could that be, right?
About five hours and edits to twenty-one files later, I had it working. Kind of. Because I ran into one of the many headdesk situations that plague Swift compiler development. I had focused on my edits to the most recent repo without rebuilding the supporting tool suite.
Ninja Builds
A ninja build is a quick way to build just the compiler component. But at some point you can’t ninja your way into an entire working toolchain. I couldn’t test my changes until I rebuilt everything, a process that can take many many hours on my Mac:
    % ./swift ~/Desktop/test.swift
    <unknown>:0: error: module file was created by an older version of the compiler; rebuild 'Swift' and try again: /Volumes/MusicAndData/BookWriting/LiveGithub/Apple/appleswift/build/Ninja-ReleaseAssert/swift-macosx-x86_64/lib/swift/macosx/x86_64/Swift.swiftmodule
Argh.
Building the compiler is not a quick thing. Even a ninja build is a non-trivial investment of time. So, to be completely honest, the total coding and build time was a lot longer than just five hours. A lot longer.
Make sure you’ve updated your repo, incorporated the latest changes for swift and all of its support tools, and built them all before working on your branches. It will save you a lot of frustration.
Be aware that in the time it takes to create even a small project, you’ll probably be out of date with master. Hopefully this won’t affect your changes, and the files you think you’re patching are still the right files you should be patching.
Designing My Changes
The best way to add anything to the Swift compiler is to find some construct that has already been contributed and look through its pull requests to discover which parts of the compiler they touched.
That’s how I got started with my assertions/optimization revision. I looked at the recent canImport() pull, and targeted the seven files it involved. In the end, I only needed to modify four files in total, excluding tests. It was a fairly “clean” and simple update.
To add #dogcow, again excluding tests, I had to change nearly two dozen files, most of them written in C++, a few using Python and Swift’s own gyb (aka “generate your boilerplate”) macros.
I’ve put up a gist that details my notes as I performed file edits. (I did have to make some further changes once I started testing.) Each group consists of a file name followed by my changes, with some context around them to make it easier to return to those parts of the file.
That’s followed by any relevant grep results that encouraged me to edit the file in question plus error messages from the compiler, of which there were a few, as I made several typos and forgot at times to add semicolons to the ends of lines. (Damn you semicolons! shakes fist) I put ### markers near the errors to make them easier to find in the notes.
As you walk through my notes, you’ll notice that I had to create a token (pound_dogcow), which is a kind of MagicIdentifierLiteralExpr expression. By inserting a simple token without arguments and returning a string, I cut down on my need to parse additional components or produce a complicated return value.
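As a rough illustration of why an argument-free token is the easy case, here is a toy C++ lookup from a pound keyword's spelling to a token kind. It is only a sketch: TokenKind and classifyPoundKeyword are hypothetical names, and the real lexer and token definitions in the compiler are organized differently.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical token kinds for this sketch; not the compiler's real token list.
enum class TokenKind {
  PoundFile,
  PoundLine,
  PoundColumn,
  PoundFunction,
  PoundDSOHandle,
  PoundDogCow,
  Unknown
};

// Maps the full spelling of an argument-free pound keyword to its kind.
// Anything not in the table is reported as Unknown.
TokenKind classifyPoundKeyword(const std::string &spelling) {
  static const std::unordered_map<std::string, TokenKind> keywords = {
      {"#file", TokenKind::PoundFile},
      {"#line", TokenKind::PoundLine},
      {"#column", TokenKind::PoundColumn},
      {"#function", TokenKind::PoundFunction},
      {"#dsohandle", TokenKind::PoundDSOHandle},
      {"#dogcow", TokenKind::PoundDogCow},
  };
  auto found = keywords.find(spelling);
  return found == keywords.end() ? TokenKind::Unknown : found->second;
}
```

Because the token takes no arguments, recognizing it is a single table lookup. There is nothing further to parse once the spelling matches.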
(Sorry Dave! I’ll get there I hope… After all, I know where each of the five components of Context lives: file, line, column, function, and dsohandle. I just don’t know how to build and export the struct so that it gets put into place and can be consumed by the Swift user.)
As a string, my #dogcow can be used as a literal, so I conformed it to KnownProtocolKind::ExpressibleByStringLiteral. It needed to be serialized and deserialized, emit its custom string, support code completion, and more. Scrolling down my file, you’ll see the scope of notes including searches, comments, and edits for this one change.
Debugging
One of the most interesting things that happened during this exercise was when I made an actual logic error, not a syntax error, so Swift compiled but my program failed:
Assertion failed: (isString() && "Magic identifier literal has non-string encoding"), function setStringEncoding, file /Users/ericasadun/github/Apple/appleswift/swift/include/swift/AST/Expr.h, line 1052.
For the longest time I was convinced (wrongly) that because I was using Unicode, I had somehow screwed up the string encoding. This was actually a coding mistake, an actual bug, and had nothing to do with the way my string was built nor the fact that I used emojis. It took a while to track down because my head was in the wrong place.
Notice I have DogCow returning true here. I accidentally swapped the two lines so it was originally returning false, falling into the Line/Column/DSOHandle case.
    bool isString() const {
      switch (getKind()) {
      case File:
      case Function:
      case DogCow: // it's a string!
        return true;
      // it should not have been down here
      case Line:
      case Column:
      case DSOHandle:
        return false;
      }
      llvm_unreachable("bad Kind");
    }
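The mistake is easier to see in isolation. The following stand-alone C++ sketch models both orderings with a simplified Kind enum. It is not the real MagicIdentifierLiteralExpr class, and isStringBuggy is a hypothetical name for the version with the case label in the wrong group.

```cpp
// Simplified stand-in for the literal kinds discussed above.
enum class Kind { File, Function, DogCow, Line, Column, DSOHandle };

// Correct grouping: DogCow sits with the string-valued kinds.
bool isString(Kind kind) {
  switch (kind) {
  case Kind::File:
  case Kind::Function:
  case Kind::DogCow:
    return true;
  case Kind::Line:
  case Kind::Column:
  case Kind::DSOHandle:
    return false;
  }
  return false; // not reached for valid kinds
}

// The mistake: the DogCow label ends up below `return true`, so it falls
// through into the Line/Column/DSOHandle group and reports false.
bool isStringBuggy(Kind kind) {
  switch (kind) {
  case Kind::File:
  case Kind::Function:
    return true;
  case Kind::DogCow:
  case Kind::Line:
  case Kind::Column:
  case Kind::DSOHandle:
    return false;
  }
  return false; // not reached for valid kinds
}
```

Both versions compile cleanly, which is why this kind of slip only surfaces at run time, when something like the assertion above checks isString() before setting a string encoding.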
Proof-of-Concept
Once compiled, I used a few simple calls to test my work. Here’s the source code I used. I accidentally added an extra space in the assignment test. You can see in the screenshot as well:
    // String interpolation and default argument
    func hello(_ greetedParty: String = #dogcow) {
        print("Hello \(greetedParty)")
    }
    hello()
    hello("there")

    // Use in assignment
    let aDogCow = #dogcow
    print("The value is ", aDogCow)

    // Use directly in statement
    print(#dogcow)
Lessons Learned
Building a working version of the compiler incorporating my solo changes, no matter how trivial (and yes, it was extremely trivial), has been a big confidence builder. Exploring the process from consuming tokens to emitting intermediate language representations has been enlightening.
- I learned to update everything and build from scratch before starting my work. Because if you don’t, you’ll end up doing it later and wasting that time.
- I learned how to track down similar modifications and use them as a template for exploring what parts of the compiler each change touched.
- I learned that some errors would not be in the compilation but in the testing, as one tends to forget things like “just because it built doesn’t mean it will compile correctly” when one is very very focused on getting things to run and extremely new to the process.
I have now worked on two (technically three) compiler modification projects. Each has taught me something new. If you’d like, take a peek at some explorations I’ve pushed to my forked repo:
The DogCow changes are clean, in the style of something that I might actually do a pull request for. The optimization checks are not. They retain all my little in-line notes I use for searching through text files to find what I’ve changed.
The early debug checks represent the time before I could get all the compiler tools built on my system. I was basically programming in my head at that point, guessing what would work or not, before the conversation on Swift forums moved me to my current design.
My guesswork was wrong. I focused on using a trio of built-in functions (like _isDebugAssertConfiguration) mentioned on-list. This turned out to be a non-viable solution. I needed to follow the example set by canImport to set my flags.
Finally, a word to the wise: Don’t ./utils/build-script -R in one terminal window and ninja swift in another at the same time. Oops.