Commit 4e51810

Optimize mbstring upper/lowercasing: use fast path in more cases
The 'fast path' in the uppercase/lowercase functions for Unicode text can be used for a slightly greater range of characters (codepoints below 0xB5 when uppercasing and below 0xC0 when lowercasing, instead of below 0x80). This is not expected to have a big impact on performance, since the number of characters which will use the 'fast path' is only increased by about 50-60, and these are not very commonly used characters... but still, it doesn't cost anything.
1 parent 36c979e commit 4e51810

ext/mbstring/php_unicode.c

Lines changed: 6 additions & 2 deletions
@@ -121,7 +121,9 @@ static inline unsigned mph_lookup(

 static unsigned php_unicode_toupper_raw(unsigned code, enum mbfl_no_encoding enc)
 {
-	if (code < 0x80) {
+	/* After the ASCII characters, the first codepoint with an uppercase version
+	 * is 0xB5 (MICRO SIGN) */
+	if (code < 0xB5) {
 		/* Fast path for ASCII */
 		if (code >= 0x61 && code <= 0x7A) {
 			if (UNEXPECTED(enc == mbfl_no_encoding_8859_9 && code == 0x69)) {
@@ -141,7 +143,9 @@ static unsigned php_unicode_toupper_raw(unsigned code, enum mbfl_no_encoding enc

 static unsigned php_unicode_tolower_raw(unsigned code, enum mbfl_no_encoding enc)
 {
-	if (code < 0x80) {
+	/* After the ASCII characters, the first codepoint with a lowercase version
+	 * is 0xC0 (LATIN CAPITAL LETTER A WITH GRAVE) */
+	if (code < 0xC0) {
 		/* Fast path for ASCII */
 		if (code >= 0x41 && code <= 0x5A) {
 			if (UNEXPECTED(enc == mbfl_no_encoding_8859_9 && code == 0x0049L)) {
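
The effect of the widened thresholds can be illustrated with a minimal, standalone sketch. This is not the code from `ext/mbstring/php_unicode.c`: the `sketch_*` names are hypothetical, the slow path is replaced by stand-ins that know only the two boundary codepoints named in the diff comments, and the ISO-8859-9 (Turkish) special case is left out entirely.

```c
#include <assert.h>

/* Hypothetical stand-ins for the table-driven slow path. They know only the
 * two boundary codepoints named in the diff comments:
 * U+00B5 MICRO SIGN uppercases to U+039C, and U+00C0 lowercases to U+00E0. */
static unsigned sketch_slow_upper(unsigned code)
{
	return code == 0xB5 ? 0x39C : code;
}

static unsigned sketch_slow_lower(unsigned code)
{
	return code == 0xC0 ? 0xE0 : code;
}

/* Uppercase with the widened fast path (threshold 0xB5 instead of 0x80). */
static unsigned sketch_toupper(unsigned code)
{
	if (code < 0xB5) {
		/* Below U+00B5, only ASCII a-z has a simple uppercase mapping */
		if (code >= 0x61 && code <= 0x7A) {
			return code - 0x20;
		}
		return code;
	}
	return sketch_slow_upper(code);
}

/* Lowercase with the widened fast path (threshold 0xC0 instead of 0x80). */
static unsigned sketch_tolower(unsigned code)
{
	if (code < 0xC0) {
		/* Below U+00C0, only ASCII A-Z has a simple lowercase mapping */
		if (code >= 0x41 && code <= 0x5A) {
			return code + 0x20;
		}
		return code;
	}
	return sketch_slow_lower(code);
}

/* Number of codepoints newly covered by each fast path, plus boundary checks. */
static void sketch_self_check(void)
{
	assert(0xB5 - 0x80 == 53); /* uppercasing: 0x80..0xB4 */
	assert(0xC0 - 0x80 == 64); /* lowercasing: 0x80..0xBF */
	assert(sketch_toupper(0xB4) == 0xB4);  /* last fast-path codepoint */
	assert(sketch_toupper(0xB5) == 0x39C); /* first slow-path codepoint */
	assert(sketch_tolower(0xBF) == 0xBF);
	assert(sketch_tolower(0xC0) == 0xE0);
}
```

Calling `sketch_self_check()` from a test harness exercises both boundaries. Under these assumptions, 53 additional codepoints skip the lookup when uppercasing and 64 when lowercasing, which is in line with the "about 50-60" estimate in the commit message.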
