The startling uniquing of Swift 4 dictionaries

As you’ve probably heard, Swift 4 now has multiline strings. Rejoice! And thank John Holdsworth. For now you can do stuff like this:

let xml = """
    <?xml version="1.0"?>
    <book id="bk101" empty="">
        <title>XML Developer's Guide</title>
        <description>An in-depth look at creating applications with XML.</description>
    </book>
    """

It’s super handy, allowing you to incorporate newlines and individual " characters without having to escape them. (You do still have to escape backslashes.)
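To make the escaping rules concrete, here is a minimal sketch. The author value and the file path are made up for illustration; the point is only that quotes pass through untouched, while a literal backslash still needs doubling and \( ) still interpolates.

```swift
// Hypothetical value, used only to show interpolation
let author = "Gambardella, Matthew"

let snippet = """
    <author>\(author)</author>
    <path>C:\\books\\bk101</path>
    <note>She said "hi"</note>
    """

// The closing delimiter's indentation (4 spaces) is stripped from every line
print(snippet)
```

The printed text has three lines with no leading spaces, and each doubled backslash comes out as a single one.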

One of the things you might want to do with a big hefty string is to count the number of words, and maybe find out which word occurs the most. So here’s another multi-line string, one pulled from a lorem ipsum generator:

let lipsum = """
    Lorem ipsum dolor sit amet, consectetur adipiscing elit. Curabitur vitae hendrerit orci. Suspendisse porta ante sed commodo tincidunt.

    Etiam vitae nunc est. Vestibulum et molestie tortor. Ut nec cursus ipsum, id euismod diam. Sed quis imperdiet neque.

    Mauris sit amet sem mattis, egestas ligula ac, fringilla ligula. Nam nec eros posuere, rhoncus neque ut, varius massa.
    """

This particular example occupies 5 lines and includes a lot of text and punctuation. Because you can now treat Strings as collections, you can do stuff like this:

let w = "Hello".filter({ $0 != "l" }) // "Heo"

Similarly, you can use character set membership to select only letters and spaces:

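import Foundation // CharacterSet comes from Foundation

let desiredCharacters = CharacterSet.letters
    .union(CharacterSet(charactersIn: " "))
let workString = lipsum.filter({ character in
    let uniScalars = character.unicodeScalars
    return desiredCharacters
        .contains(uniScalars[uniScalars.startIndex])
})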

Unfortunately, Character and CharacterSet are still struggling a bit to get along with each other, which is why I’m doing that nonsense with the unicodeScalars. Anyway, this gives you a single-line string with just letters and spaces, so you can then break the string into words.

// Split along spaces
let words = workString.split(separator: " ")
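One thing to watch: a filter that keeps only letters and spaces also throws away newlines, so the last word of one line ends up glued to the first word of the next. Here is a minimal sketch of one way around that, assuming you are happy to treat any whitespace as a word break. The names sample, keep, kept, and sampleWords are mine, and allSatisfy needs a newer toolchain than Swift 4.0.

```swift
import Foundation

let sample = """
    Lorem ipsum dolor.

    Etiam vitae, nunc est.
    """

// Keep letters plus all whitespace, so line breaks survive the filter
let keep = CharacterSet.letters.union(.whitespacesAndNewlines)
let kept = sample.filter({ character in
    character.unicodeScalars.allSatisfy({ keep.contains($0) })
})

// Split on spaces and newlines; empty pieces are dropped by default
let sampleWords = kept
    .split(whereSeparator: { $0 == " " || $0 == "\n" })
    .map({ String($0) })
```

With this input, sampleWords holds seven separate words, with "dolor" and "Etiam" kept apart.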

Dictionary now has an initializer that detects when an incoming key already exists and calls a function to combine the stored value with the new one each time that key repeats. It’s called uniquing, and it lets you do neat things like count the number of times a token appears in a sequence:

// Add to dictionary, with "uniquing"
let baseCounts = zip(words.map(String.init), repeatElement(1, count: .max))
let wordCounts = Dictionary(baseCounts, uniquingKeysWith: +)

This code pairs each word with a 1 drawn from an effectively endless sequence (Int.max repetitions of the number 1), and applies addition each time a duplicate key is found. You get exactly the same results with a closure that adds 1 to the old value, although this is uglier and a little wasteful:

let wordCounts = Dictionary(baseCounts, 
    uniquingKeysWith: { (old, _) in old + 1 })
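Here is the same counting trick on a tiny array, so the result is easy to check by eye. This is just a sketch: tokens, counts, and tally are names I made up, and the second version uses the default-value subscript, which is another way to get the same tally.

```swift
let tokens = ["a", "b", "a", "c", "a", "b"]

// Pair every token with 1, then add the values whenever a key repeats
let counts = Dictionary(
    zip(tokens, repeatElement(1, count: tokens.count)),
    uniquingKeysWith: +)

// Same tally with a loop and the default-value subscript
var tally: [String: Int] = [:]
for token in tokens {
    tally[token, default: 0] += 1
}
```

Both dictionaries map "a" to 3, "b" to 2, and "c" to 1.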

You can then find the word that appears the most:

// Find the word that appears most often
var (maxword, maxcount) = ("UNDEFINED", Int.min)
for (word, count) in wordCounts {
    if count > maxcount { (maxword, maxcount) = (word, count) }
}
print("\(maxword) appears \(maxcount) times")
// et appears 8 times (at least it did 
// in my much longer text)
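If you would rather skip the loop and the sentinel values, max(by:) on the dictionary does the same job. A minimal sketch, with made-up counts standing in for the real word tally:

```swift
// Made-up counts for illustration
let sampleCounts = ["et": 8, "sit": 3, "amet": 3]

// Compare entries by value; the result is nil only for an empty dictionary
let top = sampleCounts.max(by: { $0.value < $1.value })

if let top = top {
    print("\(top.key) appears \(top.value) times")
}
```

When two words tie for the top count, which one you get back depends on the dictionary's iteration order, just as with the loop.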

You can use uniqueKeysWithValues to fill up a dictionary by zipping two sequences:

let letterOrders = Dictionary(uniqueKeysWithValues: zip("ABCDEFGHIJKLMNOPQRSTUVWXYZ", 1...))
// ["H": 8, "X": 24, "D": 4, "J": 10, "I": 9, "M": 13, "Z": 26,
//  "S": 19, "A": 1, "C": 3, "N": 14, "Y": 25, "R": 18, "G": 7, 
//  "E": 5, "V": 22, "U": 21, "L": 12, "B": 2, "K": 11, "F": 6, 
//  "O": 15, "W": 23, "T": 20, "P": 16, "Q": 17]
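The catch with uniqueKeysWithValues is in the name: the keys really do have to be unique, and a repeated key is a runtime error. When duplicates are possible, the uniquing initializer lets you pick a winner instead. A small sketch, with pairs, firstWins, and lastWins as illustrative names:

```swift
let pairs = [("a", 1), ("b", 2), ("a", 3)]

// Dictionary(uniqueKeysWithValues: pairs) would trap here, because "a" repeats

// The closure receives (existing value, incoming value)
let firstWins = Dictionary(pairs, uniquingKeysWith: { first, _ in first })
let lastWins = Dictionary(pairs, uniquingKeysWith: { _, last in last })
```

Here firstWins keeps 1 for "a" and lastWins keeps 3.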

Another thing you might do with updated dictionaries is to build a set or array out of sequence values. This next example collects values for each key:

let invitedFriends: [(String, String)] = [
    ("Rizwan", "John"), ("Rizwan", "Abe"),
    ("Soroush", "Dave"), ("Joe", "Dave"), 
    ("Soroush", "Zev"), ("Soroush", "Erica")]
let invitationLists = Dictionary(invitedFriends.map({ ($0.0, [$0.1]) }),
    uniquingKeysWith: { (old: [String], new: [String]) in
        return old + new })
// ["Rizwan": ["John", "Abe"], "Soroush": ["Dave", "Zev", "Erica"], "Joe": ["Dave"]]
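For this particular shape of problem, the grouping initializer that arrived alongside uniquing is worth a look too. This is an alternative sketch, not a replacement for the example above; invites and grouped are names I picked.

```swift
let invites: [(String, String)] = [
    ("Rizwan", "John"), ("Rizwan", "Abe"),
    ("Soroush", "Dave"), ("Joe", "Dave"),
    ("Soroush", "Zev"), ("Soroush", "Erica")]

// Group the pairs by host, then keep only the guest name from each pair
let grouped = Dictionary(grouping: invites, by: { $0.0 })
    .mapValues({ pairs in pairs.map({ $0.1 }) })
```

Each host's guests stay in the order they appeared in the input, so the result matches the uniquing version.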

You can store a tuple of the minimum and maximum values found for each unique key. Each value has to start out as a (min, max) tuple in the input sequence, which can be ugly:

// Create 100 random numbers
let hundredRandom: [(Int, Int)] = (1...100).map({ _ in let value = Int(arc4random_uniform(10000)); return (value, value) })

// Cycle endlessly through 1...10 (zip stops after 100 values)
let tens = sequence(state: 1, next: { (value: inout Int) -> Int in
    value += 1; return (value % 10) + 1
})

// Build the two together
let values = zip(tens, hundredRandom)
let extremes = Dictionary(values, uniquingKeysWith: { (old: (Int, Int), new: (Int, Int)) in
    return (min(old.0, new.0), max(old.1, new.1))
})
// [10: (504, 8342), 2: (770, 8874), 4: (164, 7871), 9: (177, 8903), 
//  5: (1707, 9627), 6: (577, 8318), 7: (174, 8818), 3: (2837, 9198),
//  8: (3573, 9432), 1: (474, 8652)]
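Random input makes the output hard to verify, so here is the same min/max idea on a handful of fixed readings. It is only a sketch: readings, seeded, and ranges are illustrative names, and the data is invented.

```swift
// Invented readings: (key, value)
let readings = [("x", 5), ("y", 7), ("x", 2), ("y", 9), ("x", 8)]

// Seed each value as a (min, max) tuple so the combine step has something to merge
let seeded: [(String, (Int, Int))] = readings.map({ ($0.0, ($0.1, $0.1)) })

// Keep the smaller first element and the larger second element
let ranges = Dictionary(seeded, uniquingKeysWith: { old, new in
    (min(old.0, new.0), max(old.1, new.1))
})
```

With this data, "x" ends up as (2, 8) and "y" as (7, 9).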

I probably could have made this a little more elegant but I was running out of time because I had to pick up my kids. If you have improvements for the last few examples, let me know. Sorry about the rush.

p.s. Thanks for the tip about using unicodeScalars on char.
