Pattern match style filtering

I’ve written about this before, but a question came up recently that I thought was worth posting, as it’s a much simpler case than the one I wrote about last year.

Byrre_b asks:

Is there any way to write “pattern matching style filtering” in a better way then using a complete `if case` statement?

Such as:

let values: [NonEquatableEnum] = [...]
let filtered = values.filter { val in
    if case .thatOneInterestingValue = val {
        return true
    }
    return false
}

Note: Several people have pointed out if the enumeration is equatable, just use == rather than pattern matching. You can match case, even with associated values with, e.g. if case .foo = value.

You can filter using the pattern match operator, as shown here, or for equatable enumerations with ==.

enum NonEquatableEnum { case nah, blah, thatOneInterestingValue }

let values: [NonEquatableEnum] = [.nah, .blah, .nah, .thatOneInterestingValue, .nah, .thatOneInterestingValue, .blah]
let filtered = values.filter({ $0 ~= .thatOneInterestingValue })

Although this stores all values matching your subject case into filtered, the results aren’t very meaningful unless you want to count how many instances of .thatOneInterestingValue appear. That’s because filtering by enumeration case is usually limited to two situations:

  • You’re working with a structure and using the enumeration as a tag for filtering
  • You’re working with associated values and want to collect the enumeration cases and then extract the values.

The first of these is made simple with Swift 4 key paths. For example, consider the following structure:

struct Foo {
    var (x, y, z) = (0, 0, 0)
    let numnum: NonEquatableEnum
    init(_ n: NonEquatableEnum) { self.numnum = n }
}

let values2: [Foo] = [Foo(.nah), Foo(.blah), Foo(.nah), Foo(.thatOneInterestingValue), Foo(.nah), Foo(.thatOneInterestingValue), Foo(.blah)]

Assuming each instance has some more interesting data than the default (0, 0, 0) triple, pull out tagged instances using the same filter approach:

let kp = \Foo.numnum
let filtered2 = values2.filter({ $0[keyPath: kp] ~= .thatOneInterestingValue })

The key path lets you “dive” into each struct to test the enumeration member, while preserving the data stored in the other structure members. Instead of just counting how many instances of a simple enumeration there are, it acts as a meaningful filtering operation.

The second challenge, retrieving associated values, is more complex, as explained in my original write-up. Hand-crafting a result with if case gets you the values you need.

enum MoreComplicated {
    case one(Int)
    case two(Int)
    case three(String, String)
}

let values3: [MoreComplicated] = [.one(3), .two(5), .two(2), .three("hi", "there")]

Here’s an example that pulls out the case three enumerations:

let results2 = values3.filter({
    if case .three = $0 { return true } else { return false }
})

If you want to filter and extract at the same time,  add let declarations into your if case statement and switch the filter operation to a flatMap :

let results3 = values3.flatMap({
    (value: MoreComplicated) -> (String, String)? in 
        guard case .three(let x, let y) = value
            else { return nil }
        return (x, y)
    })

This returns an array of tuples, containing the associated values for each matching enumeration case.

Thoughts? Improvements? Fixes? Drop a note, tweet, or email to let me know!

Advent day 3, part 1

Are you playing along with Advent of Code? I got started a little late this year, so I’m doing a couple of days each day until I catch up.

The basis of Advent of Code isn’t so much the beauty of your code as it is the correctness. Because of this, a lot of my code is hideously ugly, with design choices best described as “questionable”. For example, at one point, I used flatMap.count on returned optionals instead of returning zeroes and ones and reducing them with +. I’m using Swift because that’s the language that’s currently dominant in my brain, although a lot of the memory manipulation would have been easier with straight C.

I wanted to share my solution for the first half of day 3. The challenge stipulates a squared spiral pattern of numbers in a grid and then asks you to calculate the distance from each number to the center. The approach I came up with for part I proved completely useless for the second half of day 3, where I had to start over from scratch. (Normally, you just modify the first code with an extra function and you go from part I to part II pretty easily.)

I took a road with part I that really didn’t fit the mindset of the challenge givers. For me, I was thinking geometrically, noticing that this was a degenerate case of a series of concentric circles. However instead of calculating the distance with a sin-cos solution, the concentric squares created a stepped triangle wave instead. Because of this, I built my solution to traverse the triangle wave and deduce the distance as the radius + the phase of the wave for any given number.

I thought I’d share my code for this because it’s pretty short and I think it’s pretty unusual for the problem domain.

for n in [1024, 12, 23, 312051] {
    for i in sequence(first: 1, next: { $0 + 2 }) {
        // Find least bounding square that contains number
        let boundingSize = i * i
        if boundingSize >= n {
            // Start of the outer edge sequence
            let firstValue = (i - 2) * (i - 2) + 1
            
            // Calculate distance to center
            let radius = (i - 1) / 2
            
            // The first number is always positioned at the
            // bottom right, one up from the bottom
            var current = radius - 1
            
            // The `direction` differential is the basis of
            // the triangle wave generator. It flips at the
            // minimum (-radius) and maximum (radius) offsets
            var direction = -1
            
            // One trip through the outer edge will be
            // sufficient to find the value
            for value in firstValue ... boundingSize {
                if value == n {
                    print("Result for \(n) is \(radius + abs(current))")
                    break
                }
                
                // Generate the next value in the triangle
                // wave pattern to determine the distance
                // from the center
                current += direction
                if abs(current) == radius { direction *= -1 }
            }
            break
        }
    }
}

Needless to say, my solution for part II had nothing to do with this wave approach. Instead, I actually constructed the 2D array, populated it with numbers and then used those as an addressing scheme to collect sums. Outside the addressing, it was basically applying a 3×3 sum filter in an address-driven convolution. Much longer, not pretty, but reasonably effective.

In the end, if I had done my “fill then look up locations” approach I used for part II for the first puzzle, it would have provided a much quicker solution although I don’t think it would have been as short or, in my opinion, nifty.

What approach did you end up using for day 3? And what made you go with that design?

How to check your security update

A macOS Security flaw opened access to users who didn’t have root passwords. So Apple updated computers overnight

Unfortunately Security Update 2017-001 turned out to bork file sharing, so Apple updated the problem both by issuing repair instructions and updating the patch.

To check whether you have the proper build, choose Apple Menu () > About This Mac. Click the System Report button and scroll down to Software. Click the word Software. You should be running 17B1003.

Thanks everyone.

p.s. Esopus Spitzenburg is my Mac mini. My MBP is Broxwood Foxwhelp. And yes, I’ve long since gone past Fuji, Gala, Rome, Honeycrisp, Pippin, Winter Banana, and many other varietals.

Building automatic `OptionSet` entries

Last night Zev Eisenberg was asking about option sets. “Do you still have to specify 1 << _n_ manually for OptionSet conformance? There’s no magic?” So I decided to build him some magic. There’s really no reason you should have to manually put in numbers like this:

public struct Traits: OptionSet {
    public typealias RawValue = Int
    public var rawValue = 0
    public init(rawValue: Traits.RawValue) {
        self.rawValue = rawValue
    }
    
    public static let bolded = 1 << 0
    public static let italicized = 1 << 1
    public static let monospaced = 1 << 2
    public static let underlined = 1 << 3
    public static let outlined = 1 << 4
}

This approach requires unnecessary bookkeeping. You have to keep track of the bits you’ve used, especially if you add or insert new options, or reorder the options  you have. It gives unnecessary prominence to the implementation detail values. There should be a more magic way.

So I decided to write him a solution that automatically generated the bit flags and hid those details from the implementation. The result looks like this:

 public static let bolded = generateOption()
 public static let italicized = generateOption()
 public static let monospaced = generateOption()
 public static let underlined = generateOption()
 public static let outlined = generateOption()

You can move things around, add new items, delete old items. It really doesn’t make a difference from a code maintenance point of view (assuming you’re doing this all during development, not after deployment, where you’d want to use availability and deprecations).

To get here, I needed to create a function that would add type-specific options to any type that conforms to OptionSet. I created a global dictionary to store option counts:

private var _optionSetDict: [AnyHashable: Int] = [:]

To be honest, I hate unparented globals like this. However, Swift does not allow adding static stored values in extensions. I couldn’t think of another better way to handle this. I also built a second global to ensure this dictionary would prevent concurrent access, so my counts would be secure:

private var _optionSetAccessQueue = DispatchQueue(
    label: "sadun.OptionSetGeneration", attributes: [.concurrent])

I needed to box my type references since Swift doesn’t allow types to conform to Hashable. They won’t work out of the box with dictionaries. This solution let me use types as keys:

/// Wraps a type to enable it for use as a dictionary key
public struct TypeWrapper<Wrapped>: Hashable {
    private let type: Wrapped.Type
    public init(_ type: Wrapped.Type) {
        self.type = type
    }
    
    /// Conforms to Equatable
    public static func ==(lhs: TypeWrapper, rhs: TypeWrapper) -> Bool {
        return lhs.type == rhs.type
    }
    
    /// Conforms to Hashable
    public var hashValue: Int {
        return ObjectIdentifier(type).hashValue
    }
}

To create a hashable type entry, I just instantiate TypeWrapper with the type.

Sven Weidauer points out I can use ObjectIdentifier directly
Here’s the OptionSet extension that implements the generateOption() magic:

public extension OptionSet where RawValue == Int {
    public static func generateOption() -> Self { 
        let key = ObjectIdentifier(Self.self)
        return _optionSetAccessQueue.sync(flags: [.barrier]) {
            // This should be locked so there's a guarantee that
            // counts are unique for each generated option
            let count = _optionSetDict[key, default: 0]
            _optionSetDict[key] = count + 1
            return .init(rawValue: 1 << count)
        }
    }
}

I’m not sure that I’d ever actually use this approach in code but it was a fun exercise in problem solving. Sven W. adds “Another thing to keep in mind is that statics are initialised the first time they are used. So in different runs of the program the values can differ. Better not persist OptionSets created by this technique.”

You can see the full implementation over at Github. And if you’re curious, you can go back through the change history to see some earlier takes on the problem.

Like it? Hate it? Got suggestions and improvements? (I always mess something up, so there’s a pretty much 100% chance there’s room for improvement.) Drop a note, a tweet, an email, or a comment.

Thanks to Ian Keen, who suggested extending OptionSet directly.

Black Friday Sale: Swift Style with a Discount

Happy Thanksgiving Week Everyone!

If you’ve been wanting to pick up an e-copy of Swift Style, there isn’t a better time. Starting tomorrow 11/22 through Friday 12/1, the book is on sale for 40% off. Use this coupon code turkeysale2017 to claim the discount!

So travel safely over the holidays, pick up some great reading, and accept my gratitude for being part of this amazing community.

Update: The PragProg website has been unresponsive. Brian MacDonald writes: “[W]e’re sorry about the delay. The sale runs until December 1, so feel free to check back in a little while.”

Device-only code: A polite request for help

You may not follow Swift Evolution. A lot of people don’t. It takes a lot of time and attention, and the signal-to-noise while good for lists of its type can be low for people with deadlines, managers, contracts, and real life.

So let me get to the point: do you have code that doesn’t or shouldn’t run in simulators? Maybe you’re building AVFoundation camera code or Metal or for the keychain or whatever? Instead of using tests like:

// Test for a simulator destination
#if (arch(i386) || arch(x86_64)) && (!os(macOS))
    // code suitable for simulator
#else
    // code suitable for device
#endif

wouldn’t it be a lot easier, more robust, and better for cross-platform development to have something simple like this?

// Test for a simulator destination 
#if targetEnvironment(simulator)
    // code suitable for simulator 
#else
    // code suitable for device 
#endif

If you think so, please send a note to kremenek@apple.com and mention your support of SE-0190. You can read the whole proposal at that link. Big thanks go to Graydon Hoare for his implementation.

Time is running short. Review ends on the 24th. This is very much a developer-driven proposal as opposed to a language-design proposal. I’d like to see Swift users have more presence in the SE community.

Thank you.

Styling an optional `Bool`

So  Russell Finn asked me today if I had any thoughts on styling an Optional<Bool>. His use case is something like this: what if you have an optional variable like Optional<NSWindow>, with a property or method on that type, and want to test against the one case where the value is not-nil and the property or method on that value is meaningful?

For example, what if you have a possible window that’s possibly zoomed? How would you style that in the most readable fashion? There are, of course, a lot of ways to express this:

if let zoomed = window?.isZoomed, zoomed { ... }
guard let window = window, window.isZoomed else { ... }
if let window = window {
    if window.isZoomed { ... }
}
guard let window = window else { ... }; 
    if window.isZoomed { ... }

and so forth.

I suppose the thing you should be asking yourself is how long the window is of interest. If the only goal is to dezoom it if it’s zoomed, the if approach makes sense. However, if you plan to perform multiple steps, then you’ll want to unwrap the window and then work directly with the unwrapped version.

There’s nothing particularly terrible about stacking conditions into a compound guard or if statement so long as the conditions are related and tell a single step-by-step story. But if they don’t, I’d recommend breaking them down, as in the second two examples.

If it’s a one and done (thanks Dave and Greg), you can compare directly with a truth value:

if window?.isZoomed == true { ... }

I’m afraid, I’d have to know a bit more about the exact circumstances of use to have a stronger opinion. I hope this helps.

Fixing Mail Plugins for High Sierra

Are all your flagged emails back? Do you want them gone? Mail plugins help make macOS mail a bit less awful. If your Mail bundles stopped working, it’s not hard to get them back up and running. The secret lies in adding a `Supported10.13PluginCompatibilityUUIDs` entry in to any mail plugin bundle’s internal Info.plist file. You can then move the plugin back from the disabled folder to the main plugin folder.

The overall technique changed in 10.12, requiring a separate entry field that’s OS-specific. You can pull the current compatibility key by issuing:

defaults read /Applications/Mail.app/Contents/Info.plist PluginCompatibilityUUID

This reads the compatibility UUID and prints it to the terminal command line. You need this UUID to edit the Info.plist file. Starting in 10.12, you need to use the XX.YY format as part of the key name. For example, here’s what the 10.13 version looks like:

<key>Supported10.13PluginCompatibilityUUIDs</key>
<array>
	<string>CompatibilityKeyHere</string>
</array>

Once you’ve done this, move the bundle back to the right folder and restart mail. The bundle should hopefully continue working.

How to `split` without consuming values

Swift’s built-in split function returns an array of subsequences, split from a sequence or collection’s elements. For example, you may have a string and want to split it into words. Here’s a simple approach that consumes whitespace, newline, and punctuation characters:

import Foundation

let string = """
    Lorem ipsum dolor sit amet, consectetur adipiscing elit.
    Praesent in purus a ante semper congue posuere at lacus.
    Nullam faucibus sem vel sem vestibulum, a ullamcorper nunc
    auctor.  
    """

extension Character {
    var firstScalar: UnicodeScalar {
        return self.unicodeScalars.first ??  UnicodeScalar(0)
    }
}

// NOTE: This code demonstrates a simple way to split a collection.
// It's not a great way to actually tokenize a string, lacking both
// localization and in-word punctuation handling such as
// hyphens and single quotes.
let targetSet = CharacterSet.whitespacesAndNewlines
    .union(CharacterSet.punctuationCharacters)
let isWordSeparator: (Character) -> Bool = { c in targetSet.contains(c.firstScalar) }
let words = string.split(whereSeparator: isWordSeparator)

But what happens when you want to split a sequence or collection without consuming values? There’s no built-in Swift solution to turn to.

A few months ago, Soroush Khanlou and I were playing around with this problem, trying to return subsequences around predicate boundaries. When the predicate failed, a new subsequence would begin. We wanted to break sequences  along logical lines, for example, where a value changed or a slope trend updated from increasing to decreasing:

let x = [1, 2, 2, 3, 3, 3, 1]
 .sliced(where: { $0 != $1 })
// [ArraySlice([1]), ArraySlice([2, 2]), ArraySlice([3, 3, 3]), ArraySlice([1])]
let z = [1, 2, 2, 1, 3, 3, 1].sliced(where: >)
// [ArraySlice([1, 2, 2]), ArraySlice([1, 3, 3]), ArraySlice([1])]

We decided to work with arrays and use ArraySlice instances. Each ArraySlice provides a view onto a larger array:

Instead of copying over the elements of a slice to new storage, an `ArraySlice` instance presents a view onto the storage of a larger array. And because `ArraySlice` presents the same interface as `Array`, you can generally perform the same operations on a slice as you could on the original array.

Slices don’t allocate new storage so the splits are efficient. Be careful not to hold onto slices long term. Each slice holds a reference to the array it describes, so you can run into memory management issues if the slice outlives the array it points to.

For safety, reference the start and end of each slice with startIndex and endIndex. These values are often not zero or the array count, the way you’d expect with actual array instances:

let array = ["a", "b", "c", "d", "e"]
let slice = array[1...3]
print(slice.startIndex) // 1
print(slice.endIndex) // 4

Here’s the code I eventually settled on. It extends ArraySlice with a recursive slicing function that ends when it runs out of data or exceeds the maximum partition count. For each slice, it walks the elements until it finds a member element that fails the predicate test, at which point it recurses with a new slice. The ArraySlice code is hidden behind an Array entry point, which wraps the slice functionality away from public view.

Here are some examples of how the code could be used to break apart arrays. They show creating subsequences and run length encoding. You could easily adapt this to more complex data. For example, you might use keypaths to chunk data structures based on time stamps or look for inflection points for rising and falling values.

// Same value
let x = [1, 2, 2, 3, 3, 3, 1]
    .sliced(where: { $0 != $1 })
// [ArraySlice([1]), ArraySlice([2, 2]), ArraySlice([3, 3, 3]), ArraySlice([1])]

// Run-length encoding
let xx = [1, 2, 2, 3, 3, 3, 3, 3, 1, 1, 2, 1, 1]
  .sliced(where: !=)
  .map { ($0.count, $0.first!) }
// [(1, 1), (2, 2), (5, 3), (2, 1), (1, 2), (2, 1)]

// Strings
let y = Array("aaaabbbbcccdef")
  .sliced(where: !=)
  .map({ String($0) })
// ["aaaa", "bbbb", "ccc", "d", "e", "f"]

// Increasing subsequences
let z = [1, 2, 2, 1, 3, 3, 1].sliced(where: >)
// [ArraySlice([1, 2, 2]), ArraySlice([1, 3, 3]), ArraySlice([1])]

// Decreasing subsequences
let w = [1, 2, 2, 1, 3, 3, 1].sliced(where: <)
// [ArraySlice([1]), ArraySlice([2, 2, 1]), ArraySlice([3, 3, 1])]

Got suggestions for improvements? Drop a note in the comments. Thanks!

Matching Regular Expressions

Last week, I wrote about Olivier Halligon’s elegant Set matching solution. Several people asked if this could be extended to pattern matching for other types, which it can. Today, I put together a little proof of concept for performing regular expression matching with strings.

I started off with the same struct plus stand-alone global pseudo-constructor approach I had used with sets. Doing so, let me have a RegexPatternMatcher type and implement a matching function to hide a constructor. After a bit of playing around,  I dropped the global function to use a normal initializer: RegexPatternMatcher("pattern") vs matching("pattern"). The call sites were longer but less garbage floating around.

Renaming the type to Regex provided a more succinct approach : Regex("pattern"). I don’t normally like using overly short type names to balance call sites but I felt that Regex said about all that needed to be spoken here:

I’m not a huge fan of NSRegularExpression. I can’t wait for a native Swift solution. The class is expensive and it relies on what feels like archaic string models that use NSRange. Olivier suggested I balance the expense with NSCache so I created a variation where converted my struct to a class, and used NSString patterns as the cache keys.

This allowed me to build each NSRegularExpression instance once and decouple the matching options from the pattern. I made the class match function ignore defaulted option arguments. Instead, I allow the cache to carry forward any matching policies that previously existed or overwrite those policies if you specify them explicitly.

There’s something wrong with my code because I could not get a cached regex to change its matching behavior even after I updated options. Look at lines 103 through 107 in this version. If you figure out what’s wrong, let me know. It returns true for all six tests, instead of four true then two false.

Anyway, it was an interesting exercise and a good way to start getting back to work as I recuperate.