Does joined() or flatMap(_:) perform better in Swift 3?

arrays swift functional-programming swift3

TL;DR

When it comes just to flattening 2D arrays (without any transformations or separators applied, see @dfri's answer for more info about that aspect), array.flatMap{$0} and Array(array.joined()) are both conceptually the same and have similar performance.

The main difference between flatMap(_:) and joined() (note that this isn't a new method, it has just been renamed from flatten()) is that joined() is always lazily applied (for arrays, it returns a special FlattenBidirectionalCollection<Base>).

Therefore in terms of performance, it makes sense to use joined() over flatMap(_:) in situations where you only want to iterate over part of a flattened sequence (without applying any transformations). For example:

    let array2D = [[2, 3], [8, 10], [9, 5], [4, 8]]

    if array2D.joined().contains(8) {
        print("contains 8")
    } else {
        print("doesn't contain 8")
    }

Because joined() is lazily applied & contains(_:) will stop iterating upon finding a match, only the first two inner arrays will have to be 'flattened' to find the element 8 from the 2D array. Although, as @dfri correctly notes below, you are also able to lazily apply flatMap(_:) through the use of a LazySequence/LazyCollection – which can be created through the lazy property. This would be ideal for lazily applying both a transformation & flattening a given 2D sequence.
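
As a minimal sketch of the lazy flatMap(_:) variant mentioned above (written against a current Swift toolchain rather than Swift 3, and using a doubling transform and a counter that are purely illustrative), the following combines a transformation with lazy flattening. The counter lets you check how many inner arrays were actually transformed before contains(_:) returned:

```swift
let array2D = [[2, 3], [8, 10], [9, 5], [4, 8]]

// Illustrative counter: how many inner arrays the transform has been applied to.
var transformedInnerArrays = 0

// `lazy` defers the work: nothing is transformed until elements are requested.
// The explicit `-> [Int]` return type selects the sequence-returning flatMap.
let doubled = array2D.lazy.flatMap { inner -> [Int] in
    transformedInnerArrays += 1
    return inner.map { $0 * 2 }
}

// contains(_:) stops at the first match, so later inner arrays need not be visited.
let found = doubled.contains(16)
print(found, transformedInnerArrays)
```

Without the lazy property, the same flatMap(_:) call would transform and concatenate all four inner arrays before contains(_:) even started.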

In cases where joined() is iterated fully through, it is conceptually no different from using flatMap{$0}. Therefore, these are all valid (and conceptually identical) ways of flattening a 2D array:

array2D.joined().map{$0}

Array(array2D.joined())

array2D.flatMap{$0}
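
A quick way to convince yourself that the three spellings agree is to compare their results directly. This is only a sanity check on one small input (the variable names are made up for illustration), not a proof of equivalence:

```swift
let array2D = [[2, 3], [8, 10], [9, 5], [4, 8]]

// The explicit [Int] annotations make the intended result type unambiguous.
let viaJoinedMap: [Int] = array2D.joined().map { $0 }
let viaArrayInit: [Int] = Array(array2D.joined())
let viaFlatMap: [Int] = array2D.flatMap { $0 }

print(viaJoinedMap == viaArrayInit && viaArrayInit == viaFlatMap) // true
print(viaFlatMap) // [2, 3, 8, 10, 9, 5, 4, 8]
```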

In terms of performance, flatMap(_:) is documented as having a time-complexity of:

O(m + n), where m is the length of this sequence and n is the length of the result

This is because its implementation is simply:

    public func flatMap<SegmentOfResult : Sequence>(
      _ transform: (${GElement}) throws -> SegmentOfResult
    ) rethrows -> [SegmentOfResult.${GElement}] {
      var result: [SegmentOfResult.${GElement}] = []
      for element in self {
        result.append(contentsOf: try transform(element))
      }
      return result
    }

As append(contentsOf:) has a time-complexity of O(n), where n is the length of the sequence to append, we get an overall time-complexity of O(m + n), where m is the length of the 2D sequence (the number of inner sequences) and n is the total length of all sequences appended (the length of the result).
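
To make the two terms concrete, here is a rough sketch that mirrors the loop above while counting the work it does. flattenCounting is a hypothetical helper written for this illustration, not part of the standard library; it counts one step per inner array (the length of the outer sequence) and one step per appended element (the length of the result):

```swift
// Hypothetical helper: flattens like the flatMap loop while counting
// the two kinds of work that make up O(m + n).
func flattenCounting<T>(_ nested: [[T]]) -> (result: [T], outerSteps: Int, appended: Int) {
    var result: [T] = []
    var outerSteps = 0   // one step per inner array visited
    var appended = 0     // one step per element copied into the result
    for inner in nested {
        outerSteps += 1
        result.append(contentsOf: inner)
        appended += inner.count
    }
    return (result, outerSteps, appended)
}

let counts = flattenCounting([[2, 3], [8, 10], [9, 5], [4, 8]])
print(counts.outerSteps, counts.appended) // 4 8
```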

When it comes to joined(), there is no documented time-complexity, as it is lazily applied. However, the main bit of source code to consider is the implementation of FlattenIterator, which is used to iterate over the flattened contents of a 2D sequence (which will occur upon using map(_:) or the Array(_:) initialiser with joined()).

    public mutating func next() -> Base.Element.Iterator.Element? {
      repeat {
        if _fastPath(_inner != nil) {
          let ret = _inner!.next()
          if _fastPath(ret != nil) {
            return ret
          }
        }
        let s = _base.next()
        if _slowPath(s == nil) {
          return nil
        }
        _inner = s!.makeIterator()
      }
      while true
    }

Here _base is the base 2D sequence, _inner is the current iterator from one of the inner sequences, and _fastPath & _slowPath are hints to the compiler to aid with branch prediction.

Assuming I'm interpreting this code correctly & the full sequence is iterated through, this also has a time complexity of O(m + n), where m is the length of the sequence, and n is the length of the result. This is because it goes through each outer iterator and each inner iterator to get the flattened elements.
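
For readers who want to step through that logic, below is a simplified re-implementation of the same idea. SimpleFlattenIterator is a made-up name, the code uses current Swift syntax (Base.Element.Element rather than Swift 3's Base.Element.Iterator.Element), and it omits the _fastPath/_slowPath hints, so treat it as a sketch of the control flow rather than the standard library's code:

```swift
// Simplified sketch of a flattening iterator (not the stdlib implementation).
struct SimpleFlattenIterator<Base: IteratorProtocol>: IteratorProtocol
    where Base.Element: Sequence {

    private var base: Base                        // iterator over the outer sequence
    private var inner: Base.Element.Iterator?     // iterator over the current inner sequence

    init(_ base: Base) {
        self.base = base
    }

    mutating func next() -> Base.Element.Element? {
        while true {
            // Drain the current inner iterator first.
            if let element = inner?.next() {
                return element
            }
            // Inner iterator is exhausted (or not started): advance the outer one.
            guard let nextInner = base.next() else {
                return nil
            }
            inner = nextInner.makeIterator()
        }
    }
}

var iterator = SimpleFlattenIterator([[1, 2], [], [3]].makeIterator())
var flattened: [Int] = []
while let element = iterator.next() {
    flattened.append(element)
}
print(flattened) // [1, 2, 3]
```

Each outer element is requested exactly once and each inner element exactly once, which is where the m + n figure comes from.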

So, performance wise, Array(array.joined()) and array.flatMap{$0} both have the same time complexity.

If we run a quick benchmark in a debug build (Swift 3.1):

    import QuartzCore

    func benchmark(repeatCount: Int = 1, name: String? = nil, closure: () -> ()) {
        let d = CACurrentMediaTime()
        for _ in 0..<repeatCount {
            closure()
        }
        let d1 = CACurrentMediaTime() - d
        print("Benchmark of \(name ?? "closure") took \(d1) seconds")
    }

    let arr = [[Int]](repeating: [Int](repeating: 0, count: 1000), count: 1000)

    benchmark {
        _ = arr.flatMap{$0} // 0.00744s
    }

    benchmark {
        _ = Array(arr.joined()) // 0.525s
    }

    benchmark {
        _ = arr.joined().map{$0} // 1.421s
    }

flatMap(_:) appears to be the fastest. I suspect that joined() being slower could be due to the branching that occurs within the FlattenIterator (although the hints to the compiler minimise this cost) – although just why map(_:) is so slow, I'm not too sure. Would certainly be interested to know if anyone else knows more about this.

However, in an optimised build, the compiler is able to optimise away this big performance difference; giving all three options comparable speed, although flatMap(_:) is still fastest by a fraction of a second:

    let arr = [[Int]](repeating: [Int](repeating: 0, count: 10000), count: 1000)

    benchmark {
        let result = arr.flatMap{$0} // 0.0910s
        print(result.count)
    }

    benchmark {
        let result = Array(arr.joined()) // 0.118s
        print(result.count)
    }

    benchmark {
        let result = arr.joined().map{$0} // 0.149s
        print(result.count)
    }

(Note that the order in which the tests are performed can affect the results. Both sets of results above are averages taken over running the tests in various different orders.)
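
If you want to repeat this kind of measurement without QuartzCore (for example on Linux), one possible approach is to average several runs with Foundation's Date. averageSeconds is a hypothetical helper, the array is deliberately smaller than in the benchmarks above, and wall-clock timing like this is only a rough guide; no timings are shown because they depend on your machine and build settings:

```swift
import Foundation

// Hypothetical helper: runs `body` `runs` times and returns the mean
// wall-clock time per run, in seconds.
func averageSeconds(runs: Int, _ body: () -> Void) -> Double {
    precondition(runs > 0, "runs must be positive")
    var total = 0.0
    for _ in 0..<runs {
        let start = Date()
        body()
        total += Date().timeIntervalSince(start)
    }
    return total / Double(runs)
}

let arr = [[Int]](repeating: [Int](repeating: 0, count: 100), count: 1000)

// Consuming each result keeps an optimised build from discarding the work.
var checksum = 0

let flatMapTime = averageSeconds(runs: 5) {
    let flat: [Int] = arr.flatMap { $0 }
    checksum += flat.count
}
let joinedTime = averageSeconds(runs: 5) {
    let flat: [Int] = Array(arr.joined())
    checksum += flat.count
}

print("flatMap: \(flatMapTime) s, joined: \(joinedTime) s, checksum: \(checksum)")
```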


From the Swiftdoc.org documentation of Array (Swift 3.0/dev) we read [emphasis mine]:
func flatMap<SegmentOfResult : Sequence>(_: @noescape (Element) throws -> SegmentOfResult)
Returns an array containing the concatenated results of calling the given transformation with each element of this sequence.
...
In fact, s.flatMap(transform) is equivalent to Array(s.map(transform).flatten()).
We may also take a look at the actual implementations of the two in the Swift source code (from which Swiftdoc is generated ...)
swift/stdlib/public/core/Join.swift
swift/stdlib/public/core/FlatMap.swift
Most notably the latter source file, where the flatMap implementations whose closure (transform) does not yield an optional value (as is the case here) are all described as
    /// Returns the concatenated results of mapping `transform` over
    /// `self`.  Equivalent to
    ///
    ///     self.map(transform).joined()
From the above (assuming the compiler can be clever about a simple { $0 } identity transform), it would seem that, performance-wise, the two alternatives should be equivalent, but joined does, imo, better show the intent of the operation.
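
The documented equivalence can be checked on a small input with a non-identity transform. The names below are illustrative only, and the check says nothing about performance, just that both spellings produce the same elements:

```swift
let words = ["ab", "cde", "f"]

// A transform that maps each element to a sequence (here, the string's characters).
let transform: (String) -> [Character] = { Array($0) }

let viaFlatMap: [Character] = words.flatMap(transform)
let viaMapJoined: [Character] = Array(words.map(transform).joined())

print(viaFlatMap == viaMapJoined) // true
print(String(viaFlatMap))         // abcdef
```
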
In addition to intent in semantics, there is one apparent use case where joined is preferable over (and not entirely comparable to) flatMap: using joined(separator:) to join sequences with a separator:
    let array = [[1,2,3],[4,5,6],[7,8,9]]
    let j = Array(array.joined(separator: [42]))
    print(j) // [1, 2, 3, 42, 4, 5, 6, 42, 7, 8, 9]
The corresponding result using flatMap is not really as neat, as we explicitly need to remove the final additional separator after the flatMap operation (two different use cases, with or without trailing separator)
    let f = Array(array.flatMap{ $0 + [42] }.dropLast())
    print(f) // [1, 2, 3, 42, 4, 5, 6, 42, 7, 8, 9]
See also a somewhat outdated post by Erica Sadun discussing flatMap vs. flatten() (note: joined() was named flatten() in Swift < 3).
Erica Sadun - Beta 6: flatten #swiftlang
