Archive for the ‘Swift’ Category

The startling uniquing of Swift 4 dictionaries

As you’ve probably heard, Swift 4 now has multiline strings. Rejoice! And thank John Holdsworth. For now you can do stuff like this:

let xml = """
    <?xml version="1.0"?>
    <catalog>
    <book id="bk101" empty="">
        <author>\\(author)</author>
        <title>XML Developer's Guide</title>
        <genre>Computer</genre>
        <price>44.95</price>
        <publish_date>2000-10-01</publish_date>
        <description>An in-depth look at creating applications with XML.</description>
    </book>
    </catalog>
"""

It’s super handy, allowing you to incorporate newline and individual " characters without having to escape them. (You do have to escape the backslash, as in the preceding example).

One of the things you might want to do with a big hefty string is to count the number of words, and maybe find out which word occurs the most. So here’s another multi-line string, one pulled from a lorem ipsum generator:

let lipsum = """
    Lorem ipsum dolor sit amet, consectetur adipiscing elit. Curabitur vitae hendrerit orci. Suspendisse porta ante sed commodo tincidunt.

    Etiam vitae nunc est. Vestibulum et molestie tortor. Ut nec cursus ipsum, id euismod diam. Sed quis imperdiet neque.

    Mauris sit amet sem mattis, egestas ligula ac, fringilla ligula. Nam nec eros posuere, rhoncus neque ut, varius massa.
    """

This particular example occupies 5 lines and includes a lot of text and punctuation. Because you can now treat Strings as collections, you can do stuff like this:

let w = "Hello".filter({ $0 != "l" }) // "Heo"

Similarly, you can use character set membership to select only letters and spaces:

let desiredCharacters = CharacterSet.letters
    .union(CharacterSet(charactersIn: " "))
let workString = lipsum.filter({ character in
    let uniScalars = character.unicodeScalars
    return desiredCharacters
        .contains(uniScalars[uniScalars.startIndex])
})

Unfortunately, Character and CharacterSet are still struggling a bit to get along with each other, which is why I’m doing that nonsense with the unicodeScalars.  Anyway, this gives you a single line string with just letters and spaces, so you can then break the string into words.

// Split along spaces
let words = workString.split(separator: " ")

Dictionary now has a feature that allows you to recognize you’re overwriting an existing key and apply a function to a key’s value each time the key is added. It’s called uniquing, and it lets you do neat things like count the number of times a token appears in a sequence:

// Add to dictionary, with "uniquing"
let baseCounts = zip(words.map(String.init), repeatElement(1, count: .max))
let wordCounts = Dictionary(baseCounts, uniquingKeysWith: +)

This code creates an infinite sequence of the number 1, and applies addition each time a duplicate key is found. You get exactly the same results by applying + 1 closure, although this is uglier and a little wasteful:

let wordCounts = Dictionary(baseCounts, 
    uniquingKeysWith: { (old, _) in old + 1 })

You can then find the word that appears the most

// Find the word that appears most often
var (maxword, maxcount) = ("UNDEFINED", Int.min)
for (word, count) in wordCounts {
    if count > maxcount { (maxword, maxcount) = (word, count) }
}
print("\(maxword) appears \(maxcount) times")
// et appears 8 times (at least it did 
// in my much longer text)

You can use uniqueKeysWithValues to fill up a dictionary by zipping two sequences:

let letterOrders = Dictionary(uniqueKeysWithValues: zip("ABCDEFGHIJKLMNOPQRSTUVWXYZ", 1...))
print(letterOrders)
// ["H": 8, "X": 24, "D": 4, "J": 10, "I": 9, "M": 13, "Z": 26,
//  "S": 19, "A": 1, "C": 3, "N": 14, "Y": 25, "R": 18, "G": 7, 
//  "E": 5, "V": 22, "U": 21, "L": 12, "B": 2, "K": 11, "F": 6, 
//  "O": 15, "W": 23, "T": 20, "P": 16, "Q": 17]

Another thing you might do with updated dictionaries is to build a set or array out of sequence values. This next example collects values for each key:

let invitedFriends: [(String, String)] = [
    ("Rizwan", "John"), ("Rizwan", "Abe"),
    ("Soroush", "Dave"), ("Joe", "Dave"), 
    ("Soroush", "Zev"), ("Soroush", "Erica")]
let invitationLists = Dictionary(
    invitedFriends.map({ ($0.0, [$0.1]) }),
    uniquingKeysWith: { (old: [String], new: [String]) in
        return old + new }
)
print(invitationLists)
// ["Rizwan": ["John", "Abe"], "Soroush": ["Dave", "Zev", "Erica"], "Joe": ["Dave"]]

You can store a tuple of the maximum and minimum values found for each unique key. The value structure has to be established in the initial streams, which can be ugly:

// Create 100 random numbers
let hundredRandom: [(Int, Int)] = (1...100).map({ _ in let value = Int(arc4random_uniform(10000)); return (value, value) })

// Create ten sequences of 1 through 10
let tens = sequence(state: 1, next: { (value: inout Int) -> Int in
    value += 1; return (value % 10) + 1
})

// Build the two together
let values = zip(tens, hundredRandom)
let extremes = Dictionary(values, uniquingKeysWith: { (old: (Int, Int), new: (Int, Int)) in
    return (min(old.0, new.0), max(old.1, new.1))
})
print(extremes)
// [10: (504, 8342), 2: (770, 8874), 4: (164, 7871), 9: (177, 8903), 
//  5: (1707, 9627), 6: (577, 8318), 7: (174, 8818), 3: (2837, 9198),
//  8: (3573, 9432), 1: (474, 8652)]

I probably could have made this a little more elegant but I was running out of time because I had to pick up my kids. If you have improvements for the last few examples, let me know. Sorry about the rush.

p.s. Thanks for the tip about using unicodeScalars on char.

Holy War: Mutable copies

Applying mutableCopy() to an NSObject returns Any, not the version of the type you’re attempting to make mutable, for example, NSMutableArray, NSMutableParagraphStyle, NSMutableAttributedString or whatever.

Nate asks:

Is is acceptable to use as! with mutableCopy() or is there a better way to do this?

// Approach 1: Forced unwrap
let mutableStyle1 = style.mutableCopy() as! NSMutableParagraphStyle

The forced as! cast used in this approach will always succeed (even if using as! makes you want to wash your hands afterwards). But there are other approaches to consider. What do you think of these alternative takes on the question? Here are some other solutions for you to weigh in on.

// Approach 2: Forced unwrap with explanation on failure

/// Very low precedence group
precedencegroup VeryLowPrecedence { lowerThan: FunctionArrowPrecedence }

infix operator !!: VeryLowPrecedence

/// Guaranteed safe unwrap or fatal error with custom error string
public func !! <Wrapped>(value: Wrapped?, complaint: String) -> Wrapped {
    guard let value = value
        else { fatalError(complaint) }
    return value
}

let mutableStyle2 = style.mutableCopy() as? NSMutableParagraphStyle !! "Guaranteed cast to mutable paragraph style failed"

// Approach 3: Guard with explanatory fatal error
guard let mutableStyle3 = style.mutableCopy() as? NSMutableParagraphStyle
    else { fatalError("Guaranteed cast to mutable paragraph style failed") }

// Approach 4: Create then set with current attributes
let mutableStyle4 = NSMutableParagraphStyle()
mutableStyle4.setParagraphStyle(style)

// Approach 5: Protocol to expose typed mutable version
public protocol AvailableMutatingVersion: NSObjectProtocol {
    associatedtype MutableForm
    func mutableCopy() -> Any
    var mutableVersion : MutableForm { get }
}

extension AvailableMutatingVersion {
    public var mutableVersion: MutableForm {
        guard let copy = self.mutableCopy() as? MutableForm
            else { fatalError("Guaranteed mutable copy could not be constructed") }
        return copy
    }
}

extension NSParagraphStyle: AvailableMutatingVersion {
    public typealias MutableForm = NSMutableParagraphStyle
}

let mutableStyle5 = style.mutableVersion

Which approach reigns supreme? Vote now or offer some alternatives…

The surprising awesomeness of Grouped Dictionaries

Yesterday, I was chatting about ways to partition a stream of values. I wanted to collect values into new streams: values that satisfied a predicate, and those that did not. A number of hugely complicated approaches were discussed until Nate Cook brought up a fantastic new Swift 4.0 API. The Dictionary type’s init(grouping:by:) call allows you to convert any sequence to a dictionary by grouping its elements.

Pass the initializer a sequence and a closure, and the initializer creates entries for each value returned by the closure. For a predicate, you end up with two groups: one populated for true predicate values, one for false:

let numbers = 1 ... 20
let predicate: (Int) -> Bool = {  $0 % 2 == 0 }
let grouping = Dictionary(grouping: numbers, by: predicate)
print(grouping)
// [false: [1, 3, 5, 7, 9, 11, 13, 15, 17, 19], 
//  true: [2, 4, 6, 8, 10, 12, 14, 16, 18, 20]]

Once partitioned, you can pull values from each collection (true and false) and operate on the members of that particular group:

// Iterate through the even members of this sequence
for number in (grouping[true, default:[]]) { ... }

This example uses Swift’s new default value in its grouping look-up:

grouping[true, default:[]]

If you haven’t started using this new feature, you should really adopt it into your work flow. It’s wonderful. With this call, a dictionary returns the default value when a key is not found. This avoids forced unwraps (dictionary lookups normally return optionals) and acts as an alias for nil coalescing. In this example, the default call is an alias for (grouping[true] ?? []).

Dictionary grouping provides a solid solution for sequence partitioning by predicates. But you can also do a lot more with this API. Let me give you a bunch of examples that showcase the power of this one little call.

This example creates a dictionary of names grouped by first letter. Swift creates an entry for each unique capitalized letter it finds within the name collection:

let names = ["bob", "Amelia", "joe", "alice", "jane"].map({ $0.capitalized })
let nameDict = Dictionary(grouping: names.filter({ !$0.isEmpty })) {
    $0.prefix(1)
}
print(nameDict)
// ["J": ["Joe", "Jane"], "B": ["Bob"], "A": ["Amelia", "Alice"]]

You could easily expand this example to disregard diacritical marks by stripping them through a StringTransform. (This approach is left as an exercise for the reader. ???? )

In more realistic text-based grouping, the information you want to group on is often a level or two down within a structure. Swift keypaths make it easy to access the information you need for grouping. This next example constructs a keypath to a contact’s last name, and uses that keypath to provide the partition keys for the dictionary.

/// A person
struct Person {
    let firstName: String
    let lastName: String
}

/// A contact entry
struct Contact: CustomStringConvertible {
    let name: Person
    let address: String
    var description: String { return "\(name.firstName) \(name.lastName) at \(address)" }
}

/// Construct some contacts
let lasts = ["smith", "jones", "simpson", "cheese", "putty"]
let contacts: [Contact] = zip(names, lasts)
    .map({ Person(firstName: $0.0.capitalized, lastName: $0.1.capitalized) })
    .map({ Contact(name: $0, address: "1 Main St" )})

// Establish keypath to the last name field
let lastNameKeypath = \Contact.name.lastName

// Construct the address book based on the first letter
// of the contact's last name, then print the
// address book
let addressBook = Dictionary(grouping: contacts) { 
    $0[keyPath: lastNameKeypath].prefix(1).uppercased() 
}
for (key, value) in addressBook { 
    print(key, value) 
}

// J [Amelia Jones at 1 Main St]
// C [Alice Cheese at 1 Main St]
// P [Jane Putty at 1 Main St]
// S [Bob Smith at 1 Main St, Joe Simpson at 1 Main St]

Grouped dictionaries aren’t limited to strings and numbers. They can be quite helpful when working with interface elements. For example, in a complex user-managed  presentation, you might group button views based on their control state:

// buttonArray is [NSButton]
let buttons = Dictionary(grouping: buttonArray, by: { $0.state })

Although all the examples so far have used sequence data to produce the keys used in grouping, those keys needn’t be pulled directly from the values they categorize. Enumerations make a great Swift choice for categorizing data. This example revisits the even/odd grouping shown in the first example of this write-up but replaces the true/false predicate values with a Parity enumeration.

enum Parity {
    case even, odd
    init(_ value: Int) {
        self = value % 2 == 0 ? .even : .odd
    }
}
let parity = Dictionary(grouping: 0 ..< 10 , by: Parity.init )

You can use this same approach with more extensive enumerations and more complicated data. That said, you can take exactly this example, and simplify it enormously by grouping numbers based on a simple function. For example, you can use the direct results of the modulo operator (which returns 1 and 0)  as the keys to your grouped dictionary:

let parity2 = Dictionary(grouping: 0 ..< 10) { $0 % 2 }

In the end, it’s up to you on how you want to split your sequence, and the keys you want to represent the subsequences derived from that split. Hopefully this post has given you a few ideas to inspire your own partitioning schemes.

If you like my write-ups, please consider buying a book. Thanks to everyone who contributed to the discussion about partitioning sequences, especially Nate Cook, Tim Vermeulen, Kabir Oberai, Daniel Jalkut, Tom Harrington, Soroush Khanlou, Paul Cantrell, and Zachary Drayer.

Working with text output streams

Here’s another great write-up from Tim V. 

By far the most common way to add a custom textual representation to a type is by implementing CustomStringConvertible. Consider this type:

struct Person {
    let name: String
    let age: Int
    let spouse: String?
    let children: [String]
}

A pretty standard way to conform to CustomStringConvertible would be this:

extension Person: CustomStringConvertible {
    var description: String {
        var description = ""
        
        description += "Name: \(name)\n"
        description += "Age: \(age)\n"
        
        if let spouse = spouse {
            description += "Spouse: \(spouse)\n"
        }
        
        if !children.isEmpty {
            description += "Children: \(children.joined(separator: ", "))\n"
        }
        
        return description
    }
}

This code doesn’t look too bad, but it’s a pain to write var description = "" and return description over and over, if this is a pattern you commonly use. It’s also quite easy to forget to add \n to each line.

The relatively unknown standard library protocol TextOutputStreamable solves both of these problems for you. Rather than adding a description computed property, all you have to do is write your properties to a TextOutputStream instance:

extension Person: TextOutputStreamable {
    func write<Target: TextOutputStream>(to target: inout Target) {
        print("Name:", name, to: &target)
        print("Age:", age, to: &target)
        
        if let spouse = spouse {
            print("Spouse:", spouse, to: &target)
        }
        
        if !children.isEmpty {
            print("Children:", children.joined(separator: ", "), to: &target)
        }
    }
}

That’s it! Whenever something that conforms to TextOutputStreamable but not to CustomStringConvertible is turned into a string, the write(to:) method we just implemented is used:

let person = Person(name: "Michael", age: 45, spouse: "Emma", children: ["Charlotte", "Jacob"])
print(person)

>Name: Michael
>Age: 45
>Spouse: Emma
>Children: Charlotte, Jacob

If you enjoyed this write-up, you might be interested in an old post I wrote about using streams to transform data.

Initializing Typed Arrays

Foundation’s URLQueryItem is just a stringly-typed key-value pair. You create one with a name and value:

public init(name: String, value: String?)

Since Swift supports literal initialization, you’d think you could use a dictionary to set up a [URLQueryItem] array, right? Well, yes and no.

You can’t just conform Array where Element == URLQueryItem to ExpressibleByDictionaryLiteral. Array extensions with constraints cannot have inheritance clauses. There are several ways around this limitation.

First (and best), you can just map an initializer across a dictionary literal:

let result = ["key1": "value1", "key2": "value2", "key3": "value3"]
    .map({ URLQueryItem(name: $0.key, value: $0.value) })

Second, you could make URLQueryItem itself dictionary initializable, but that starts to get ugly:

extension URLQueryItem: ExpressibleByDictionaryLiteral {
    public typealias Key = String
    public typealias Value = String
    public init(dictionaryLiteral elements: (String, String)...) {
        guard elements.count == 1
            else { fatalError("URLQueryItem requires single key-value pair") }
        self.init(name: elements.last!.0, value: elements.last!.1)
    }
}

let uq: URLQueryItem = ["key": "value"] // okay
let uqs: [URLQueryItem] = [["key1": "value1"], ["key2": "value2"], ["key3": "value3"]] // bleh

Third, you could use some kind of intermediate type to produce a URL query item array using Swift shortcuts. For example, you can set up a struct that builds the query item array and then pull from there:

struct URLQueryItems: ExpressibleByDictionaryLiteral {
    public typealias Key = String
    public typealias Value = String
    let items: [URLQueryItem]
    public init(dictionaryLiteral elements: (String, String)...) {
        items = elements.map({ URLQueryItem(name: $0.0, value: $0.1) })
    }
}

let uqis: URLQueryItems = ["key1": "value1", "key2": "value2", "key3": "value3"]
uqis.items

But again, it’s really not an improvement on using a mapped dictionary.

In the best of all worlds, which doesn’t exist, you’d be able to do something like this, but I don’t think there’s a way to accomplish this in modern Swift. Solution 2 is about as close as you get.

myRequest.queryItems = ["key1": "value1", "key2": "value2", "key3": "value3"]

Is there something I’m overlooking? If so, let me know. Drop a comment, mail, or tweet. Thanks.

Update:

and

The problem with Swift Playground localization

Starting in Swift Playgrounds 2, you can now use localized strings to guide the narration of your interactive lessons. As the screenshot above demonstrates, you can used localizable markup to provide the most appropriate text for titles, introductory text, and feedback.

However, what you can’t do is localize Swift members. Your French and Chinese consumers must tell Byte to moveForward(), not avancer() or 向前移动().

One of the guiding principles of the Swift language is demonstrated in its embrace of unicode for identifier symbols. This approach accommodates programmers and programming styles from many languages and cultures.

Xcode 9 has introduced major advances in code refactoring. It seems an obvious win to allow that technology to be applied to Swift Playgrounds 2, enabling identifier localization.

That’s because identifiers play such a key role in Swift Playgrounds. Unlike standard development tasks, where it’s unnecessary to create idiomatic APIs like IUContrôleurDeNavigation, the point of Swift Playgrounds is to teach and instruct. It uses small, limited, controlled API exposure, nearly all custom and supporting of the teaching story.

The anthropomorphized Byte character acts as a stand-in for the learner coder. And in doing so, it should communicate with commands that this coder identifies with, turnLeft and moveForward, not incomprehensibleForeignPhrase1 and evenMoreConfusingForeignPhrase2.

I think this is an opportunity waiting to happen, and I can’t imagine it would be all that hard to implement given the expansive identifier suite and the limited API visibility presented in a typical playgroundBook.

What do you think? Is it too much to ask for a localizable.Members.plist?

State of the Swift PM from the Swift PM PM

From Rick Ballard, Swift Package Manager release manager, a status update with regard to Swift 4. Here’s an overview of the proposals that have recently been incorporated into the system:

We’ve implemented a number of evolution proposal[s] this Spring:

• SE-0152 [ https://github.com/apple/swift-evolution/blob/master/proposals/0152-package-manager-tools-version.md ] introduced a “tools version”, stored in a comment at the top of the Package.swift manifest, which allows you to declare what version of the Swift tools your package requires, and to avoid resolving package dependencies against package versions which require newer tools than those you are using. The tools version is also used to control whether to enable new features such as the revised package manifest API, and to determine which Swift language version is used to interpret the manifest.

• SE-0158 [ https://github.com/apple/swift-evolution/blob/master/proposals/0158-package-manager-manifest-api-redesign.md ] redesigned the Package manifest API, adopting the Swift API design guidelines and cleaning up some problems in our API. Existing packages can continue to use the old Package manifest API until they update their Swift tools version.

• SE-0151 [ https://github.com/apple/swift-evolution/blob/master/proposals/0151-package-manager-swift-language-compatibility-version.md ] introduced a property to control which Swift language version should be used to compile a package’s sources. If this property is not set, the default is determined by the package’s Swift tools version.

• SE-0146 [ https://github.com/apple/swift-evolution/blob/master/proposals/0146-package-manager-product-definitions.md ] added explicit control over what products a package vends to clients, and what targets are used to build each product. Packages using the new manifest API must declare any products which they wish their package to vend.

• SE-0175 [ https://github.com/apple/swift-evolution/blob/master/proposals/0175-package-manager-revised-dependency-resolution.md ] removed the conditional “pinning” feature and replaced it with a simpler “resolved package versions” file (“Package.resolved”), along with improvements to SwiftPM’s dependency resolution behaviors.

• SE-0150 [ https://github.com/apple/swift-evolution/blob/master/proposals/0150-package-manager-branch-support.md ] added support for packages which depend on a branch, rather than a tagged version, of other packages. This is especially useful during bringup of a new package, and in-between tagged releases. In order to enforce stability for tagged versions, a tagged package version must only depend on other packages’ tagged versions, not on branches.

• SE-0162 [ https://github.com/apple/swift-evolution/blob/master/proposals/0162-package-manager-custom-target-layouts.md ] added API for controlling where the source files for each target should be found. This allows SwiftPM to support source trees that don’t conform to the standard package layout conventions.

• SE-0149 [ https://github.com/apple/swift-evolution/blob/master/proposals/0149-package-manager-top-of-tree.md ] added support for a “–path” flag to `swift package edit`, allowing users to take full control over an edited dependency and share it between multiple top-level packages.

In addition to these proposals, we’ve made other significant improvements:

• On macOS, package manifest interpretation and the package build are now sandboxed, to help mitigate the effects of maliciously crafted manifests.

• Many error messages and diagnostics have been improved, including information about conflicts during dependency resolution.

• Xcode project generation has been improved, and now allows schemes to reference package targets across regenerations of the project.

• There have been a host of smaller improvements and bugfixes.

All these changes are available in the Xcode 9 beta released today, and in the Swift 4.0 Development snapshots available at https://swift.org/download/#snapshots.

Xcode 9 lays the groundwork for first-class, native support in Xcode for Swift packages with the preview version of its new build system. This build system provides the flexibility and extensibility needed to allow Xcode to support new build models, such as Swift packages. Additionally, considerable work has gone into the SwiftPM library vended by the SwiftPM project, which will support integrating Swift packages into tools like Xcode.

So what’s left in SwiftPM 4? First, we will be implementing SE-0179 [ https://github.com/apple/swift-evolution/blob/master/proposals/0179-swift-run-command.md ], support for a `swift package run` command. Beyond that, we expect to start winding down this release and looking ahead to the next, though we are still open to suggestions or evolution proposals for modest features and fixes.

There are a few features that we originally considered for SwiftPM 4 which are unlikely to be included in this release at this point: performance test support, support for configuration files, support for repositories which contain more than one package, and build settings support. With the possible exception of configuration files, these are likely to be a high priority for the next release. In particular, the core team has done work on a design for build settings which we expect to invite comment and discussion on early in the next release; this is a fairly consequential feature, and we want to make sure to get it right. Since that feature is not landing in SwiftPM 4, we are considering adding some package properties in SwiftPM 4 that will help alleviate some of the biggest pain points here, such as a C++ language version property.

Other features we will likely consider for the next release cycle include support for package resources (such as image assets), license and metadata support, explicit support for handling source control forking, and a generic mechanism for invoking build tools that the package manager doesn’t natively support. Finally, we do anticipate supporting a centralized package index at some point in the future, and we may begin laying the groundwork for that in the upcoming year.

As always, we appreciate the support, feedback, contributions, and adoption we’ve gotten from the package manager community. We’re looking forward to working with you all over the upcoming year to make SwiftPM even better.

Apple open sources key file-level transformation Xcode components

Ted Kremenek writes on the swift-dev list:

This afternoon at WWDC we announced a new refactoring feature in Xcode 9 that supports Swift, C, Objective-C, and C++.  We also announced we will be open sourcing the key parts of the engine that support file-level transformations, as well as the compiler pieces for the new index-while-building feature in Xcode.

We will be releasing the sources in stages, likely over the next few weeks:

– For the refactoring support for Swift, there are some cleanups we’d like to do as well as some documentation we’d like to author before we push these sources back.  Argyrios Kyrtzidis and his team from Apple will be handling that effort.

– For the refactoring support for C/C++/Objective-C, these are changes we’d like to work with the LLVM community to upstream to the LLVM project.  These will likely be first staged to the swift-clang repository on GitHub, but that is not their intended final destination.  Duncan Exon Smith and his team from Apple will be handling that effort.

– We’ll also be open sourcing the compiler support for indexing-while-building, which include changes to both Clang and Swift.   Argyrios and his team will be driving that effort.  For the clang changes they will likely be first staged to swift-clang, and then discussed with the LLVM community to upstream them to mainline Clang.

– Finally, we will be open sourcing the remaining pieces of the Swift migrator.  Argyrios and his team will be handling the push back of changes there, and those changes will only be impacting the swift repository.

As usually, we’ll also be pushing back changes to have Swift work with the latest Apple SDKs.  We’re expecting that push back to happen early next week.  When that happens we will temporarily lock commit access to the repositories.  Details about that will be sent out later in a later email.  Until then, the downloadable toolchains from Swift.org will continue to work with Xcode 8.3.2.  After we do the push back the downloadable toolchains will be moved to be baselined on the Xcode 9.0 betas.  This shift is necessary as changes to the overlays depend on the latest SDKs.

Writing a binary search tree

Tim discusses using Swift enumeration indirect keywords to build binary tree nodes. Read on to learn more about how reference types and value semantics combine…

Today we’re writing a simple binary search tree in Swift. While binary search trees seem to be very powerful in theory, their performance is rather disappointing in practice. Other, more advanced tree-shaped data structures such as red-black trees and B-trees solve some of the practical problems that binary search trees have, and are subsequently much more useful in performance-critical code. Nevertheless, learning about binary search trees is the first step to getting more familiar with this class of data structures.

Probably the simplest way to represent a binary tree in Swift is by using an indirect enum, like so:

enum BinarySearchTree<Element: Comparable> {
    case empty
    indirect case node(left: BinarySearchTree, value: Element, right: BinarySearchTree)
}

While indirect enums use reference types under the hood, they have value semantics by default. So that’s nice.

From all the standard library’s protocols, SetAlgebra and BidirectionalCollection both seem a good fit for our BinarySearchTree type: SetAlgebra for inserting and removing elements, and BidirectionalCollection for traversing the tree (both forwards and backwards, hence the name). However, for the purpose of this post, we’ll stick to only a couple basic methods.

Let’s start with insertion. Because of the way enums with associated values work, it’s easiest to implement the insert method using an under-the-hood inserting method:

extension BinarySearchTree {
    mutating func insert(_ element: Element) {
        self = inserting(element)
    }
    
    private func inserting(_ element: Element) -> BinarySearchTree {
        switch self {
        case .empty:
            // the tree is empty, so inserting an element results in a tree containing only that element
            return .node(.empty, element, .empty)
            
        case .node(_, element, _):
            // the element is already present in the tree
            return self
            
        case let .node(left, value, right) where element < value:
            // the element should be inserted into the left subtree
            return .node(left.inserting(element), value, right)
            
        case let .node(left, value, right):
            // the element should be inserted into the right subtree
            return .node(left, value, right.inserting(element))
        }
    }
}

Now we can insert values into a tree, but we can’t read them in any way. So let’s add a contains method as well:

extension BinarySearchTree {
    func contains(_ element: Element) -> Bool {
        switch self {
        case .empty:
            // an empty tree obviously doesn't contain any elements!
            return false
            
        case .node(_, element, _):
            // the element is equal to this node's value
            return true
            
        case let .node(left, value, _) where element < value:
            // if the element is present in the tree, it must be in the left subtree
            return left.contains(element)
            
        case let .node(_, _, right):
            // if the element is present in the tree, it must be in the right subtree
            return right.contains(element)
        }
    }
}

Let’s try it out!

var tree = BinarySearchTree.empty

tree.contains(5) // => false
tree.insert(5)
tree.contains(5) // => true
tree.insert(3)
tree.contains(3) // => true
tree.contains(5) // => true

Looks good. We can’t actually remove elements, though, but that is a real pain to implement and it’s out of scope for this post.

Finally, to iterate over a binary search tree, we need to implement the IteratorProtocol protocol. We’ll use an in-order traversal algorithm:

extension BinarySearchTree: Sequence {
    func makeIterator() -> BinarySearchTreeIterator<Element> {
        return BinarySearchTreeIterator(self)
    }
}

struct BinarySearchTreeIterator<Element: Comparable>: IteratorProtocol {
    var node: BinarySearchTree
    var stack: [(Element, BinarySearchTree)] = []
    
    init(_ node: BinarySearchTree) {
        self.node = node
    }
    
    public mutating func next() -> Element? {
        while case let .node(left, value, right) = node {
            stack.append((value, right))
            node = left
        }
        
        guard let (element, node) = stack.popLast() else { return nil }
        
        self.node = node
        return element
    }
}

Now we can do all kinds of sequence-y stuff with trees, like so:

var tree = BinarySearchTree.empty
[5, 2, 4, 8, 3, 2].forEach { tree.insert($0) }

for element in tree {
    print(element, terminator: " ") // => 2 3 4 5 8
}

tree.reduce(0, +)                                  // => 22
tree.lazy.map(String.init).joined(separator: ", ") // => 2, 3, 4, 5, 8

And the list goes on.

That’s it for now! If you enjoy this kind of stuff, make sure to check out Károly Lőrentey’s brand new book Optimizing Collections in which he in much detail goes through implementing several data structures in Swift, focusing on performance. As of writing this post, it’s 25% off.