Archive for the ‘Tricks of the Trade’ Category

Writing Swift: Adventures in Compiler Mods

Ever since Swift adopted the “implement first, propose second” rule, some contributors to the Swift Evolution process have felt sidelined and dropped out of the community. It’s frustrating having ideas for positive language changes and not being able to implement those changes.

Despite expertise in Swift and Objective-C (not to mention any number of other languages), they, like me, may not be proficient in C++ and Python, the core tools of Swift language implementation. To date, my code contributions to Swift have been extremely low level.

I think I fixed a comment, added a string, and maybe made one or two other tiny tweaks. (I did work on the Changelog file a while back but that is written in CommonMark, and does not involve programming in the slightest.)

I’ve wanted to be able to build what I can dream. And I’ve slowly been diving into the compiler in recent months to see what it takes to build something new. With quite a lot of hand holding from John Holdsworth, I implemented a couple of build directives to test whether (1) asserts can fire and (2) a build is optimized.

What is Debug?

Answering “what does ‘debug’ mean?” was harder than I initially thought. Coming as I do from Xcode, where ‘debug’ means a scheme I select from within the IDE, it took some thought and advice to consider ‘debug’ from a platform-independent viewpoint. After going back and forth on the Swift Evolution email list and later the forums, the consensus centered on the two tests I mentioned above: assertions and optimization.

For many projects, a typical debug build is unoptimized where asserts can fire. As projects move into the beta process, that mindset changes. Many in-house and beta builds meant for wider use need optimization.

The state of the art uses custom conditional compilation flags set with -D. This approach ties the meaning of ‘debug’ (or for most developers, ‘DEBUG’) to build settings rather than to anything in the language itself that could stand apart from those settings and persist through source code. Assert configurations have their own flag, -assert-config <#value#>.
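For comparison, here is a minimal sketch of the -D approach as it works today. The flag name DEBUG is only a convention and buildFlavor is a name I made up for illustration; the compiler attaches no meaning to either.

```swift
// Compile with `swiftc -D DEBUG main.swift` to take the first branch.
// DEBUG is a naming convention, not something the language defines.
func buildFlavor() -> String {
    #if DEBUG
    return "debug"
    #else
    return "release"
    #endif
}

print("This is a \(buildFlavor()) build")
```

Because the flag lives in the build settings, nothing stops a project from defining DEBUG in an optimized build or omitting it from an unoptimized one, which is the gap the two proposed tests aim to close.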

Introducing these two tests lets you align your not-for-release code to assert states and/or optimization states:

#if !configuration(optimized)
    // code for un-optimized builds only
#endif

#if configuration(assertsWillFire)
    // code where assertions can fire
#endif

The proof-of-concept implementation coupled with a proposal means I may be able to submit a more substantial and meaningful contribution to the language.

Going Solo

Pushing forward, I wanted to test myself and check whether I could make changes on my own, even if that solo journey was quite small. I started by attempting to build the #exit directive that was being discussed on the forums. This turned out to be a little more complicated than I was ready for.

Among other challenges, #exit may or may not be embedded in an #if build configuration test. Using #exit should allow all material up until that point to be compiled into the build while gracefully excluding all material after. I didn’t know how to check whether the directive was embedded in a condition and how to properly complete the condition (#endif) while discarding remaining text. It was, at my stage of the journey, a step too far.

I put my first attempt to the side and tried something else. I tried to push a scope using with(value) { }, so the material within the scope was native to value. That too proved too difficult without assistance although I am beginning to understand how Swift creates and manages its scope. It was a programming failure but a learning success.

Two projects abandoned, I knew I had to pick something very easy to work with. Although I would have loved to have picked up and run with Dave DeLong’s context pitch (which is discussed here), I recognized that I needed to bite off something smaller first. So I decided to add a #dogcow token that produces the string value `"🐶🐮"` in source. How difficult could that be, right?

About five hours and edits to twenty-one files later, I had it working. Kind of. Because I ran into one of the many frequent headdesk situations that plague Swift compiler development. I had focused on my edits to the most recent repo without rebuilding the supporting tool suite.

Ninja Builds

A ninja build is a quick way to build just the compiler component. But at some point you can’t ninja your way into an entire working toolchain. I couldn’t test my changes until I rebuilt everything, a process that can take many many hours on my Mac:

% ./swift ~/Desktop/test.swift
:0: error: module file was created by an older version of the compiler; rebuild 'Swift' and try again: /Volumes/MusicAndData/BookWriting/LiveGithub/Apple/appleswift/build/Ninja-ReleaseAssert/swift-macosx-x86_64/lib/swift/macosx/x86_64/Swift.swiftmodule

Argh.

Building the compiler is not a quick thing. Even a ninja build is a non-trivial investment of time. So if you want to be completely honest, the total coding and build time was a lot longer than just five hours. A lot longer.

Make sure you’ve updated your repo, incorporated the latest changes for swift and all of its support tools, and built them all before working on your branches. It will save you a lot of frustration.

Be aware that in the time it takes to create even a small project, you’ll probably be out of date with master. Hopefully this won’t affect your changes, and the files you think you’re patching are still the right files you should be patching.

Designing My Changes

The best way to add anything to the Swift compiler is to find some construct that has already been contributed and look through its pull requests to discover what parts of the compiler they touched.

That’s how I got started with my assertions/optimization revision. I looked at the recent canImport() pull request and targeted the seven files it involved. In the end, I only needed to modify four files in total, excluding tests. It was a fairly “clean” and simple update.

To add #dogcow, again excluding tests, I had to change nearly two dozen files, most of them written in C++, a few using Python and Swift’s own gyb (aka “generate your boilerplate”) macros.

I’ve put up a gist that details my notes as I performed file edits. (I did have to make some further changes once I started testing.) Each group consists of a file name followed by my changes, with some context around them to make it easier to return to those parts of the file.

That’s followed by any relevant grep results that encouraged me to edit the file in question plus error messages from the compiler, of which there were a few, as I made several typos and forgot at times to add semicolons to the ends of lines. (Damn you semicolons! shakes fist) I put ### markers near the errors to make them easier to find in the notes.

As you walk through my notes, you’ll notice that I had to create a token (pound_dogcow), which is a kind of MagicIdentifierLiteralExpr expression. By inserting a simple token without arguments and returning a string, I cut down on my need to parse additional components or produce a complicated return value.

(Sorry Dave! I’ll get there I hope… After all, I know where each of the five components of Context live: file, line, column, function, and dsohandle. I just don’t know how to build and export the struct so that it gets put into place and can be consumed by the Swift user.)

As a string, my #dogcow can be used as a literal, so I conformed it to KnownProtocolKind::ExpressibleByStringLiteral. It needed to be serialized and deserialized, emit its custom string, support code completion, and more. Scrolling down my file, you’ll see the scope of notes including searches, comments, and edits for this one change.
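Since #dogcow only exists in my fork, here is a small sketch using two magic literals that stock Swift already ships, #function and #line, to show the behavior the new token piggybacks on. The function names describeCall and example are made up for illustration.

```swift
// #function and #line are existing magic literals. Used as default
// arguments, they are evaluated at the call site, not inside describeCall.
func describeCall(function: String = #function, line: Int = #line) -> String {
    return "\(function), line \(line)"
}

func example() -> String {
    // Reports "example()" plus the number of the line below.
    return describeCall()
}

print(example())
// Explicit arguments override the defaults.
print(describeCall(function: "custom", line: 42))
```

A default argument of #dogcow behaves the same way in my build, except that it always expands to the same string.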

Debugging

One of the most interesting things that happened during this exercise was when I made an actual logic error, not a syntax error, so Swift compiled but my program failed:

Assertion failed: (isString() && "Magic identifier literal has non-string encoding"), function setStringEncoding, file /Users/ericasadun/github/Apple/appleswift/swift/include/swift/AST/Expr.h, line 1052.

For the longest time I was convinced (wrongly) that because I was using Unicode, I had somehow screwed up the string encoding. This was actually a coding mistake, an actual bug, and had nothing to do with the way my string was built nor the fact that I used emojis. It took a while to track down because my head was in the wrong place.

Notice I have DogCow returning true here. I accidentally swapped the two lines so it was originally returning false, falling into the Line/Column/DSOHandle case.

bool isString() const {
  switch (getKind()) {
  case File:
  case Function:
  case DogCow: // it's a string!
    return true;
  // DogCow should not have been down here
  case Line:
  case Column:
  case DSOHandle:
    return false;
  }
  llvm_unreachable("bad Kind");
}

Proof-of-Concept

Once the compiler built, I used a few simple calls to test my work. Here’s the source code I used. I accidentally added an extra space in the assignment test, which you can see in the screenshot as well:

// String interpolation and default argument
func hello(_ greetedParty: String = #dogcow) {
    print("Hello \(greetedParty)")
}

hello()
hello("there")

// Use in assignment
let aDogCow = #dogcow
print("The value is ", aDogCow)

// Use directly in statement
print(#dogcow)

Lessons Learned

Having built a working version of the compiler incorporating my solo changes, no matter how trivial (and yes, it was extremely trivial), has been a big confidence builder. Exploring the process from consuming tokens to emitting intermediate language representations has been enlightening.

  • I learned to update everything and build from scratch before starting my work. Because if you don’t, you’ll end up doing it later and wasting that time.
  • I learned how to track down similar modifications and use them as a template for exploring what parts of the compiler each change touched.
  • I learned that some errors would not be in the compilation but in the testing, as one tends to forget things like “just because it built doesn’t mean it will compile correctly” when one is very very focused on getting things to run and extremely new to the process.

I have now worked on two (technically three) compiler modification projects. Each has taught me something new. If you’d like, take a peek at some explorations I’ve pushed to my forked repo:

The DogCow changes are clean, in the style of something that I might actually do a pull request for. The optimization checks are not. They retain all my little in-line notes I use for searching through text files to find what I’ve changed.

The early debug checks represent the time before I could get all the compiler tools built on my system. I was basically programming in my head at that point, guessing what would work or not, before the conversation on Swift forums moved me to my current design.

My guesswork was wrong. I focused on using a trio of built-in functions (like _isDebugAssertConfiguration) mentioned on-list. This turned out to be a non-viable solution. I needed to follow the example set by canImport to set my flags.
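For context, this is roughly what calling one of those functions looks like from user code. It is a sketch, not a recommendation: _isDebugAssertConfiguration() is an underscored standard library function, so it is callable but not supported API and may change. The wrapper name assertsCanFire is mine.

```swift
// _isDebugAssertConfiguration() is underscored standard library API:
// callable from user code, but unsupported and subject to change.
func assertsCanFire() -> Bool {
    return _isDebugAssertConfiguration()
}

print(assertsCanFire() ? "assert() is active" : "assert() is compiled out")
```

The result can only be branched on with an ordinary if statement, not with #if, which is part of why a build configuration test in the style of canImport was a better fit.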

Finally, a word to the wise: Don’t ./utils/build-script -R in one terminal window and ninja swift in another at the same time. Oops.

Cleaning up doc comments for formatted commits

I’m working on a proposal to introduce CountedSet, cousin to NSCountedSet, to Swift. This kind of type involves a massive amount of doc comment content. I decided to adapt the comments from Cocoa Foundation (for NSCountedSet) and Swift Foundation (for Set) as part of my coding and quickly found how ridiculous it was to do this by hand.

At first I tried to write an Xcode “reflow doc comments” extension but as I’ve found in the past, Xcode extensions are a dreadful pain to program and debug. Although an extension would be my preferred workflow, it really wasn’t worth the time it would take.

Instead, I decided to create a simple playground. I’d paste my Swift file into a known Resources file (in this case, test.swift, although I’m sure you can come up with a better name than that if you use this). I’d process the text with a simple playground implementation and print to stdout.

It was an interesting problem to solve and one that took slightly longer than I anticipated. It’s also one that’s only partially complete. The log jams involved looking ahead at the next line to decide when each blob of text was complete so it could be reflowed, preserving paragraph breaks in the comments, respecting code blocks, and leaving any in-file code intact. Reflowing the words was much easier. I’m sure you’ve written that part of it in any number of algorithms and intro-language classes.

The parts I didn’t tackle were the special formatting required for doc comment keywords, like - Parameter, - Returns, - SeeAlso, and so forth. The associated lines for these items must be reflowed with proper indentation so the Quick Help parser can properly parse them. I leave that for another day because they are relatively minor work compared to reflowing long and complex doc comments as a whole.
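To make the moving parts concrete, here is a minimal sketch of the approach described above. It is not the code from my playground: the function names, the 80-column default, and the rule that five or more spaces after /// marks a code block line are all assumptions made for illustration. Like the original, it makes no attempt to handle keywords such as - Parameter.

```swift
/// Greedily packs words into lines that start with `prefix` and stay
/// within `width` characters (a single over-long word gets its own line).
func wrap(_ words: [String], prefix: String, width: Int) -> [String] {
    var lines: [String] = []
    var current = prefix
    for word in words {
        if current == prefix {
            current += word
        } else if current.count + 1 + word.count <= width {
            current += " " + word
        } else {
            lines.append(current)
            current = prefix + word
        }
    }
    if current != prefix { lines.append(current) }
    return lines
}

/// Reflows `///` paragraphs, keeping paragraph breaks, indented code
/// blocks inside comments, and ordinary source lines untouched.
func reflowDocComments(_ source: String, width: Int = 80) -> String {
    var output: [String] = []
    var pending: [String] = []   // words of the paragraph being collected
    var indent = ""              // leading whitespace of that paragraph

    // Emits the collected paragraph, if any, as wrapped comment lines.
    func flush() {
        guard !pending.isEmpty else { return }
        output += wrap(pending, prefix: indent + "/// ", width: width)
        pending = []
    }

    for rawLine in source.split(separator: "\n", omittingEmptySubsequences: false) {
        let line = String(rawLine)
        let trimmed = String(line.drop(while: { $0 == " " || $0 == "\t" }))
        guard trimmed.hasPrefix("///") else {
            flush()
            output.append(line)          // in-file code: leave intact
            continue
        }
        let body = trimmed.dropFirst(3)
        let text = body.drop(while: { $0 == " " })
        let leadingSpaces = body.count - text.count
        if text.isEmpty || leadingSpaces >= 5 {
            flush()
            output.append(line)          // paragraph break or code block line
            continue
        }
        if pending.isEmpty {
            indent = String(line.prefix(line.count - trimmed.count))
        }
        pending += text.split(separator: " ").map(String.init)
    }
    flush()
    return output.joined(separator: "\n")
}
```

Instead of peeking at the next line, this version buffers words and flushes them whenever it meets a line that ends the paragraph, which amounts to the same look-ahead decision made one step later.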

I’ve put my code up on GitHub if you want to offer improvements, fixes, or feedback:

 

Pattern match style filtering

I’ve written about this before, but a question came up recently that I thought was worth posting, as it’s a much simpler case than the one I wrote about last year.

Byrre_b asks:

Is there any way to write “pattern matching style filtering” in a better way then using a complete `if case` statement?

Such as:

let values: [NonEquatableEnum] = [...]
let filtered = values.filter { val in
    if case .thatOneInterestingValue = val {
        return true
    }
    return false
}

Note: Several people have pointed out that if the enumeration is Equatable, you can just use == rather than pattern matching. Pattern matching with, for example, if case .foo = value matches on the case alone, even when that case carries associated values.

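You can filter using the pattern match operator (~=), as shown here, or with == for equatable enumerations. Despite its name, NonEquatableEnum qualifies for both: Swift automatically synthesizes Equatable for enumerations whose cases have no associated values, and the standard library’s ~= relies on that conformance.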

enum NonEquatableEnum { case nah, blah, thatOneInterestingValue }

let values: [NonEquatableEnum] = [.nah, .blah, .nah, .thatOneInterestingValue, .nah, .thatOneInterestingValue, .blah]
let filtered = values.filter({ $0 ~= .thatOneInterestingValue })

Although this stores all values matching your subject case into filtered, the results aren’t very meaningful unless you want to count how many instances of .thatOneInterestingValue appear. That’s because filtering by enumeration case is usually limited to two situations:

  • You’re working with a structure and using the enumeration as a tag for filtering
  • You’re working with associated values and want to collect the enumeration cases and then extract the values.

The first of these is made simple with Swift 4 key paths. For example, consider the following structure:

struct Foo {
    var (x, y, z) = (0, 0, 0)
    let numnum: NonEquatableEnum
    init(_ n: NonEquatableEnum) { self.numnum = n }
}

let values2: [Foo] = [Foo(.nah), Foo(.blah), Foo(.nah), Foo(.thatOneInterestingValue), Foo(.nah), Foo(.thatOneInterestingValue), Foo(.blah)]

Assuming each instance has some more interesting data than the default (0, 0, 0) triple, pull out tagged instances using the same filter approach:

let kp = \Foo.numnum
let filtered2 = values2.filter({ $0[keyPath: kp] ~= .thatOneInterestingValue })

The key path lets you “dive” into each struct to test the enumeration member, while preserving the data stored in the other structure members. Instead of just counting how many instances of a simple enumeration there are, it acts as a meaningful filtering operation.

The second challenge, retrieving associated values, is more complex, as explained in my original write-up. Hand-crafting a result with if case gets you the values you need.

enum MoreComplicated {
    case one(Int)
    case two(Int)
    case three(String, String)
}

let values3: [MoreComplicated] = [.one(3), .two(5), .two(2), .three("hi", "there")]

Here’s an example that pulls out the case three enumerations:

let results2 = values3.filter({
    if case .three = $0 { return true } else { return false }
})

If you want to filter and extract at the same time, add let declarations to your case pattern and switch the filter operation to compactMap (spelled flatMap before Swift 4.1):

let results3 = values3.compactMap({
    (value: MoreComplicated) -> (String, String)? in
    guard case .three(let x, let y) = value
        else { return nil }
    return (x, y)
})

This returns an array of tuples, containing the associated values for each matching enumeration case.
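If you extract the same case in several places, one option is to move the pattern match into the enumeration itself. This is a sketch of that variation, not something from my original write-up: the property name threePayload is hypothetical, the enumeration is repeated so the example stands alone, and compactMap requires Swift 4.1 or later.

```swift
enum MoreComplicated {
    case one(Int)
    case two(Int)
    case three(String, String)

    // Hypothetical helper: the payload of `.three`, or nil for any other case.
    var threePayload: (String, String)? {
        guard case .three(let x, let y) = self else { return nil }
        return (x, y)
    }
}

let sampleValues: [MoreComplicated] = [.one(3), .two(5), .two(2), .three("hi", "there")]

// compactMap discards the nil results, leaving only the extracted tuples.
let payloads = sampleValues.compactMap { $0.threePayload }
print(payloads)
```

The trade-off is one small property per case you care about, in exchange for call sites that no longer need to spell out the closure signature.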

Thoughts? Improvements? Fixes? Drop a note, tweet, or email to let me know!

How to check your security update

A macOS security flaw opened root access on systems that didn’t have a root password set. So Apple updated computers overnight.

Unfortunately, Security Update 2017-001 turned out to bork file sharing, so Apple addressed the problem both by issuing repair instructions and by updating the patch.

To check whether you have the proper build, choose Apple menu > About This Mac. Click the System Report button and scroll down to Software. Click the word Software. You should be running build 17B1003.

Thanks everyone.

p.s. Esopus Spitzenburg is my Mac mini. My MBP is Broxwood Foxwhelp. And yes, I’ve long since gone past Fuji, Gala, Rome, Honeycrisp, Pippin, Winter Banana, and many other varietals.

The Perfect QA Recruitment Filter

Have you ever heard of the “Brown M&M” clause? The band Van Halen used to issue a contract rider for its shows. In it, they requested a supply of M&Ms for backstage but specifically excluded any brown ones. Van Halen reserved the right to cancel the show if any brown M&Ms were found.

Superficially, this may sound like a particularly obnoxious and entitled rock star request. However, there was a deeper motivation for this contract stipulation. As articles in recent years have revealed, Van Halen’s “no brown M&Ms” clause acted as an early warning system that alerted the band about potentially unsafe venue conditions.

Steve Jones of Entrepreneur writes:

In now-departed arenas such as Toronto’s Maple Leaf Gardens, the original Boston Garden and Chicago Stadium, Van Halen was loading in massive amounts of staging, sound equipment and lighting. Unfortunately, these buildings were never built to accommodate a rock band of Van Halen’s scope. Without specific guidelines, old floors could buckle and collapse, beams could rupture, and the lives of the band, their crew and fans could be at serious risk.

To ensure the promoter had read every single word in the contract, the band created the “no brown M&M’s” clause. It was a canary in a coalmine to indicate that the promoter may have not paid attention to other more important parts of the rider, and that there could be other bigger problems at hand.

Whenever the band found brown M&M’s candies backstage, they immediately did a complete line check, inspecting every aspect of the sound, lighting and stage setup to make sure it was perfect.

This kind of smart business check isn’t limited to large-scale traveling productions. JF Poole of Primate Labs was telling me the other evening about a similar approach he uses for recruiting Quality Assurance engineers.

“What I love,” he said, “is that pretty much every cover letter we’ve received for the position has cited the candidate’s ‘great attention to detail’ but almost none of them include the candidate’s favorite primate.”

Odd detail, right? But Primate Labs specifically asks for that as part of their recruitment process. The job listing says, “Please, mention your favorite primate in your cover letter.” For a position whose foundation is careful adherence to detail through every stage of production, it’s the perfect test.

As with Van Halen’s brown M&M’s, the recruiters at Primate Labs can quickly scan incoming applications for one unique signifier. Even better, that request tests a candidate’s intrinsic suitability for the position: a rigid and fanatical adherence to detail. When an applicant doesn’t pay attention to the job listing, they probably won’t pay proper attention to your software. It’s a genius approach.

In some cases, sophisticated tells aren’t exactly needed (for example, “I, $NAME (sic), have come across an opportunity for the position of Software QA Analyst for your esteemed company.”) but it’s helpful to adopt a quick indicator, allowing HR to set aside resumes for more serious consideration.

When I mentioned how sad I was that I couldn’t write up a post about this, John assured me that it would be okay. (“You’re overestimating the set of people who would a. read your blog, and b. apply for our job. Go for it.”) Can you think of any other job category that can so easily hide stealth “tells” for qualifications outside of, maybe, “profreader” (sic) and other consistency-driven positions?

For what it matters, my favorite primate (outside of my husband and kids) is the Slow Loris. Isn’t it cute?

(image via International Animal Rescue)