Commit f56d44a

[benchmark] Add a basic benchmark for Unicode._CharacterRecognizer
This measures the performance of the stdlib’s core grapheme breaking algorithm, without any `String` overhead.
1 parent d70e16b commit f56d44a

3 files changed, +100 -0 lines changed

benchmark/CMakeLists.txt

+1
@@ -57,6 +57,7 @@ set(SWIFT_BENCH_MODULES
     single-source/CharacterLiteralsLarge
     single-source/CharacterLiteralsSmall
     single-source/CharacterProperties
+    single-source/CharacterRecognizer
     single-source/Chars
     single-source/ClassArrayGetter
     single-source/CodableTest

benchmark/single-source/CharacterRecognizer.swift

+97
@@ -0,0 +1,97 @@
//===--- StringEdits.swift ------------------------------------------------===//
//
// This source file is part of the Swift.org open source project
//
// Copyright (c) 2014 - 2023 Apple Inc. and the Swift project authors
// Licensed under Apache License v2.0 with Runtime Library Exception
//
// See https://swift.org/LICENSE.txt for license information
// See https://swift.org/CONTRIBUTORS.txt for the list of Swift project authors
//
//===----------------------------------------------------------------------===//

import TestsUtils

public var benchmarks: [BenchmarkInfo] {
  guard #available(macOS 9999, iOS 9999, watchOS 9999, tvOS 9999, *) else {
    return []
  }
  return [
    BenchmarkInfo(
      name: "CharacterRecognizer.mixed",
      runFunction: { n in
        run(string: mixedString, n: n)
      },
      tags: [.api, .String],
      setUpFunction: { blackHole(mixedString) }),
    BenchmarkInfo(
      name: "CharacterRecognizer.ascii",
      runFunction: { n in
        run(string: asciiString, n: n)
      },
      tags: [.api, .String],
      setUpFunction: { blackHole(asciiString) }),
  ]
}

let mixedString = #"""
The powerful programming language that is also easy to learn.
손쉽게 학습할 수 있는 강력한 프로그래밍 언어.
🪙 A 🥞 short 🍰 piece 🫘 of 🌰 text 👨‍👨‍👧‍👧 with 👨‍👩‍👦 some 🚶🏽 emoji 🇺🇸🇨🇦 characters 🧈
some🔩times 🛺 placed 🎣 in 🥌 the 🆘 mid🔀dle 🇦🇶or🏁 around 🏳️‍🌈 a 🍇 w🍑o🥒r🥨d
Unicode is such fun!
U̷n̷i̷c̷o̴d̴e̷ ̶i̸s̷ ̸s̵u̵c̸h̷ ̸f̵u̷n̴!̵
U̴̡̲͋̾n̵̻̳͌ì̶̠̕c̴̭̈͘ǫ̷̯͋̊d̸͖̩̈̈́ḛ̴́ ̴̟͎͐̈i̴̦̓s̴̜̱͘ ̶̲̮̚s̶̙̞͘u̵͕̯̎̽c̵̛͕̜̓h̶̘̍̽ ̸̜̞̿f̵̤̽ṷ̴͇̎͘ń̷͓̒!̷͍̾̚
U̷̢̢̧̨̼̬̰̪͓̞̠͔̗̼̙͕͕̭̻̗̮̮̥̣͉̫͉̬̲̺͍̺͊̂ͅ\#
n̶̨̢̨̯͓̹̝̲̣̖̞̼̺̬̤̝̊̌́̑̋̋͜͝ͅ\#
ḭ̸̦̺̺͉̳͎́͑\#
c̵̛̘̥̮̙̥̟̘̝͙̤̮͉͔̭̺̺̅̀̽̒̽̏̊̆͒͌̂͌̌̓̈́̐̔̿̂͑͠͝͝ͅ\#
ö̶̱̠̱̤̙͚͖̳̜̰̹̖̣̻͎͉̞̫̬̯͕̝͔̝̟̘͔̙̪̭̲́̆̂͑̌͂̉̀̓́̏̎̋͗͛͆̌̽͌̄̎̚͝͝͝͝ͅ\#
d̶̨̨̡̡͙̟͉̱̗̝͙͍̮͍̘̮͔͑\#
e̶̢͕̦̜͔̘̘̝͈̪̖̺̥̺̹͉͎͈̫̯̯̻͑͑̿̽͂̀̽͋́̎̈́̈̿͆̿̒̈́̽̔̇͐͛̀̓͆̏̾̀̌̈́̆̽̕ͅ
"""#

let _asciiString = #"""
Swift is a high-performance system programming language. It has a clean
and modern syntax, offers seamless access to existing C and Objective-C code
and frameworks, and is memory safe by default.

Although inspired by Objective-C and many other languages, Swift is not itself
a C-derived language. As a complete and independent language, Swift packages
core features like flow control, data structures, and functions, with
high-level constructs like objects, protocols, closures, and generics. Swift
embraces modules, eliminating the need for headers and the code duplication
they entail.

Swift toolchains are created using the script
[build-toolchain](https://github.com/apple/swift/blob/main/utils/build-toolchain).
This script is used by swift.org's CI to produce snapshots and can allow for
one to locally reproduce such builds for development or distribution purposes.
A typical invocation looks like the following:

```
$ ./swift/utils/build-toolchain $BUNDLE_PREFIX
```

where ``$BUNDLE_PREFIX`` is a string that will be prepended to the build date
to give the bundle identifier of the toolchain's ``Info.plist``. For instance,
if ``$BUNDLE_PREFIX`` was ``com.example``, the toolchain produced will have
the bundle identifier ``com.example.YYYYMMDD``. It will be created in the
directory you run the script with a filename of the form:
``swift-LOCAL-YYYY-MM-DD-a-osx.tar.gz``.
"""#
let asciiString = String(repeating: _asciiString, count: 10)

@available(macOS 9999, iOS 9999, watchOS 9999, tvOS 9999, *)
func run(string: String, n: Int) {
  var state = Unicode._CharacterRecognizer()
  var c = 0
  for _ in 0 ..< n {
    for scalar in string.unicodeScalars {
      if state.hasBreak(before: scalar) {
        c += 1
      }
    }
  }
  blackHole(c)
}
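
For comparison, the same kind of loop can be sketched against the public `String` API, which is the overhead this benchmark is meant to exclude. The snippet below is not part of the commit: `characterCount(of:repetitions:)` and `sample` are hypothetical names chosen for illustration, and it relies only on `String.count` and `String.unicodeScalars`. It makes no claim about how the counts relate to what `hasBreak(before:)` returns.

```swift
// Hypothetical helper, not part of this commit: counts grapheme clusters
// through the public String API instead of Unicode._CharacterRecognizer.
func characterCount(of string: String, repetitions n: Int) -> Int {
  var c = 0
  for _ in 0 ..< n {
    // String.count walks the string and finds grapheme cluster boundaries.
    c += string.count
  }
  return c
}

// "e" + COMBINING ACUTE ACCENT, a two-scalar regional-indicator flag, then "a".
let sample = "e\u{301}\u{1F1FA}\u{1F1F8}a"
print(sample.unicodeScalars.count)  // 5 scalars
print(sample.count)                 // 3 characters (grapheme clusters)
print(characterCount(of: sample, repetitions: 4))  // 12
```

The gap between the scalar count and the character count is what the grapheme breaking algorithm has to resolve, one scalar at a time, in the `run` function above.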

benchmark/utils/main.swift

+2
@@ -45,6 +45,7 @@ import ChainedFilterMap
 import CharacterLiteralsLarge
 import CharacterLiteralsSmall
 import CharacterProperties
+import CharacterRecognizer
 import Chars
 import ClassArrayGetter
 import CodableTest
@@ -230,6 +231,7 @@ register(ChainedFilterMap.benchmarks)
 register(CharacterLiteralsLarge.benchmarks)
 register(CharacterLiteralsSmall.benchmarks)
 register(CharacterProperties.benchmarks)
+register(CharacterRecognizer.benchmarks)
 register(Chars.benchmarks)
 register(CodableTest.benchmarks)
 register(Combos.benchmarks)
