#
23f99f08 |
| 26-Jun-2024 |
Ayesh Karunaratne |
ext/mbstring: update UCD parser to accept characters with multiple properties
|
Revision tags: php-8.2.0RC1, php-8.1.10, php-8.0.23, php-8.0.23RC1, php-8.1.10RC1, php-8.2.0beta3, php-8.2.0beta2, php-8.1.9, php-8.0.22, php-8.1.9RC1, php-8.2.0beta1, php-8.0.22RC1, php-8.0.21, php-8.1.8, php-8.2.0alpha3, php-8.1.8RC1, php-8.2.0alpha2, php-8.0.21RC1, php-8.0.20, php-8.1.7, php-8.2.0alpha1, php-7.4.30, php-8.1.7RC1, php-8.0.20RC1, php-8.1.6, php-8.0.19, php-8.1.6RC1, php-8.0.19RC1, php-8.0.18, php-8.1.5, php-7.4.29, php-8.1.5RC1, php-8.0.18RC1, php-8.1.4, php-8.0.17, php-8.1.4RC1, php-8.0.17RC1, php-8.1.3, php-8.0.16, php-7.4.28, php-8.1.3RC1, php-8.0.16RC1, php-8.1.2, php-8.0.15, php-8.1.2RC1, php-8.0.15RC1, php-8.0.14, php-8.1.1, php-7.4.27, php-8.1.1RC1, php-8.0.14RC1, php-7.4.27RC1, php-8.1.0, php-8.0.13, php-7.4.26, php-7.3.33, php-8.1.0RC6, php-7.4.26RC1, php-8.0.13RC1, php-8.1.0RC5, php-7.3.32, php-7.4.25, php-8.0.12, php-8.1.0RC4, php-8.0.12RC1, php-7.4.25RC1, php-8.1.0RC3 |
|
#
0b32a15e |
| 22-Sep-2021 |
Alex Dowad |
Optimize mb_str{,im}width for performance

Rather than doing a linear search of a table of fullwidth codepoint ranges for every input character: 1) short-cut the search if the codepoint is below the first such range; 2) otherwise, do a binary (rather than linear) search.
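As a rough illustration of the lookup strategy described in this commit, the sketch below combines the early exit with a binary search over sorted ranges. It is written in Python for brevity and is not the mbstring C code; the table `FULLWIDTH_RANGES` and the helpers `is_fullwidth` and `str_width` are hypothetical names, and the table is a hand-picked excerpt rather than the generated data.

```python
# Hypothetical, abbreviated table of (first, last) fullwidth ranges.
# It must be sorted and non-overlapping for the binary search to work.
FULLWIDTH_RANGES = [
    (0x1100, 0x115F),  # Hangul Jamo initial consonants
    (0x3000, 0x303E),  # CJK symbols and punctuation
    (0x4E00, 0x9FFF),  # CJK Unified Ideographs
    (0xFF01, 0xFF60),  # Fullwidth ASCII variants
]


def is_fullwidth(cp: int) -> bool:
    """Return True if codepoint cp falls inside one of the table's ranges."""
    # 1) Short-cut: anything below the first range cannot match.
    if cp < FULLWIDTH_RANGES[0][0]:
        return False
    # 2) Otherwise, binary search over the sorted ranges.
    lo, hi = 0, len(FULLWIDTH_RANGES) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        first, last = FULLWIDTH_RANGES[mid]
        if cp < first:
            hi = mid - 1
        elif cp > last:
            lo = mid + 1
        else:
            return True
    return False


def str_width(s: str) -> int:
    """Count 2 columns for fullwidth characters and 1 for everything else."""
    return sum(2 if is_fullwidth(ord(ch)) else 1 for ch in s)
```

The early exit matters because ASCII and most Latin text sits below the first range, so the common case costs a single comparison.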
|
Revision tags: php-8.0.11, php-7.4.24, php-7.3.31, php-8.1.0RC2, php-7.4.24RC1, php-8.0.11RC1, php-8.1.0RC1 |
|
#
425c2e3b |
| 24-Aug-2021 |
Nikita Popov |
Combine control into one character group

Same as with punct, we're currently not interested in distinguishing between Cc and Cf, so only store their union. |
#
f458b160 |
| 24-Aug-2021 |
Nikita Popov |
Combine punctuation into one character group

We're not currently interested in distinguishing between individual punctuation types, so just merge everything into one general category to make the property lookup more efficient.
|
Revision tags: php-7.4.23, php-8.0.10 |
|
#
3be94217 |
| 24-Aug-2021 |
Nikita Popov |
Don't use sentinel value for unicode property lookup

0xffff was used to mark character properties without any members. This made the code unnecessarily complicated, because we need to check for 0xffff values when looking up the property ranges. We can simply encode this as an empty set of ranges.
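One way to picture the "empty set of ranges" encoding is an offsets table in which an empty property simply starts and ends at the same index. The Python sketch below assumes such a layout; `RANGES`, `OFFSETS`, and `prop_lookup` are hypothetical names chosen for illustration and do not mirror the generated C tables.

```python
# Hypothetical layout: RANGES holds the (first, last) pairs of all properties
# back to back; OFFSETS[i] and OFFSETS[i + 1] delimit the pairs of property i.
RANGES = [
    (0x30, 0x39),  # property 0: ASCII digits
    (0x41, 0x5A),  # property 2: ASCII uppercase letters
    (0x61, 0x7A),  # property 2: ASCII lowercase letters
]
OFFSETS = [0, 1, 1, 3]  # property 1 is empty: its start equals its end


def prop_lookup(cp: int, prop: int) -> bool:
    """Binary search the ranges of one property; no sentinel check needed."""
    lo, hi = OFFSETS[prop], OFFSETS[prop + 1] - 1
    while lo <= hi:  # never entered for an empty property (lo > hi)
        mid = (lo + hi) // 2
        first, last = RANGES[mid]
        if cp < first:
            hi = mid - 1
        elif cp > last:
            lo = mid + 1
        else:
            return True
    return False
```

With this encoding the empty case falls out of the loop condition, so the lookup has no special branch for a marker value.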
|
Revision tags: php-7.3.30, php-8.1.0beta3, php-8.0.10RC1, php-7.4.23RC1, php-8.1.0beta2, php-8.0.9, php-7.4.22, php-8.1.0beta1, php-7.4.22RC1, php-8.0.9RC1, php-8.1.0alpha3, php-7.4.21, php-7.3.29, php-8.0.8, php-8.1.0alpha2, php-7.4.21RC1, php-8.0.8RC1, php-8.1.0alpha1, php-8.0.7, php-7.4.20, php-8.0.7RC1, php-7.4.20RC1, php-8.0.6, php-7.4.19, php-7.4.18, php-7.3.28, php-8.0.5, php-8.0.5RC1, php-7.4.18RC1, php-8.0.4RC1, php-7.4.17RC1, php-8.0.3, php-7.4.16, php-8.0.3RC1, php-7.4.16RC1, php-8.0.2, php-7.4.15, php-7.3.27, php-8.0.2RC1, php-7.4.15RC2, php-7.4.15RC1, php-8.0.1, php-7.4.14, php-7.3.26, php-7.4.14RC1, php-8.0.1RC1, php-7.3.26RC1, php-8.0.0, php-7.3.25, php-7.4.13, php-8.0.0RC5, php-7.4.13RC1, php-8.0.0RC4, php-7.3.25RC1, php-7.4.12, php-8.0.0RC3, php-7.3.24, php-8.0.0RC2, php-7.4.12RC1, php-7.3.24RC1, php-7.2.34, php-8.0.0rc1, php-7.4.11, php-7.3.23 |
|
#
d8c785b8 |
| 24-Sep-2020 |
Alex Dowad |
Update 'East Asian Width' table to comply with Unicode 13.0

Instead of manually maintaining the data in eaw_table.h, it is now automatically generated by ucgendat/ucgendat.php, using the EastAsianWidth.txt file from the Unicode Consortium.

Something must be said about the deleted test case. Back in 2004, someone noticed that `mb_strwidth` didn't comply with Unicode 4.0. A test case was added to expose the problem. Well, time keeps moving on, and with the changing years, new Unicodes are born and old Unicodes die. Some characters which were counted as double-width in Unicode 4.0 are no longer such in Unicode 13.0, which renders the test case obsolete.

At the same time, make a couple of spelling/grammar fixes in ucgendat.php.
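To give a sense of what generating the table from EastAsianWidth.txt involves, here is a minimal Python sketch, not the ucgendat.php code. It assumes the input is sorted by codepoint, that only the W (wide) and F (fullwidth) classes count as double-width, and that adjacent ranges should be merged; `parse_wide_ranges` and `SAMPLE` are hypothetical names, and the sample lines merely follow the file's `codepoints;class # comment` format.

```python
def parse_wide_ranges(lines):
    """Collect merged (first, last) ranges whose width class is W or F."""
    ranges = []
    for line in lines:
        data = line.split("#", 1)[0].strip()  # drop trailing comments
        if not data:
            continue  # blank or comment-only line
        codepoints, width = (field.strip() for field in data.split(";")[:2])
        if width not in ("W", "F"):
            continue
        first, _, last = codepoints.partition("..")
        first = int(first, 16)
        last = int(last, 16) if last else first  # single codepoint entry
        if ranges and ranges[-1][1] + 1 == first:
            ranges[-1] = (ranges[-1][0], last)  # extend the adjacent range
        else:
            ranges.append((first, last))
    return ranges


# A few sample lines in the EastAsianWidth.txt format.
SAMPLE = [
    "# EastAsianWidth sample",
    "0020;Na          # Zs         SPACE",
    "1100..115F;W     # Lo    [96] HANGUL CHOSEONG KIYEOK..HANGUL CHOSEONG FILLER",
    "3000;F           # Zs         IDEOGRAPHIC SPACE",
    "3001..3003;W     # Po     [3] IDEOGRAPHIC COMMA..DITTO MARK",
]
```

Merging adjacent entries keeps the table short, which in turn keeps any range search over it cheap.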
|
Revision tags: php-8.0.0beta4, php-7.4.11RC1, php-7.3.23RC1, php-8.0.0beta3, php-7.4.10, php-7.3.22, php-8.0.0beta2, php-7.3.22RC1, php-7.4.10RC1, php-8.0.0beta1, php-7.4.9, php-7.2.33, php-7.3.21, php-8.0.0alpha3, php-7.4.9RC1, php-7.3.21RC1, php-7.4.8, php-7.2.32, php-8.0.0alpha2, php-7.3.20, php-8.0.0alpha1, php-7.4.8RC1, php-7.3.20RC1, php-7.4.7, php-7.3.19, php-7.4.7RC1, php-7.3.19RC1, php-7.4.6, php-7.2.31, php-7.4.6RC1, php-7.3.18RC1, php-7.2.30, php-7.4.5, php-7.3.17, php-7.4.5RC1, php-7.3.17RC1, php-7.3.18, php-7.4.4, php-7.2.29, php-7.3.16, php-7.4.4RC1, php-7.3.16RC1, php-7.4.3, php-7.2.28, php-7.3.15RC1, php-7.4.3RC1, php-7.3.15, php-7.2.27, php-7.4.2, php-7.3.14, php-7.3.14RC1, php-7.4.2RC1, php-7.4.1, php-7.2.26, php-7.3.13, php-7.4.1RC1, php-7.3.13RC1, php-7.2.26RC1, php-7.4.0, php-7.2.25, php-7.3.12, php-7.4.0RC6, php-7.3.12RC1, php-7.2.25RC1, php-7.4.0RC5, php-7.1.33, php-7.2.24, php-7.3.11, php-7.4.0RC4, php-7.3.11RC1, php-7.2.24RC1, php-7.4.0RC3, php-7.2.23, php-7.3.10, php-7.4.0RC2, php-7.2.23RC1, php-7.3.10RC1, php-7.4.0RC1, php-7.1.32, php-7.2.22, php-7.3.9, php-7.4.0beta4, php-7.2.22RC1, php-7.3.9RC1, php-7.4.0beta2, php-7.1.31, php-7.2.21, php-7.3.8, php-7.4.0beta1, php-7.2.21RC1, php-7.3.8RC1, php-7.4.0alpha3, php-7.3.7, php-7.2.20, php-7.4.0alpha2, php-7.3.7RC3, php-7.3.7RC2, php-7.2.20RC2, php-7.4.0alpha1, php-7.3.7RC1, php-7.2.20RC1, php-7.2.19, php-7.3.6, php-7.1.30, php-7.2.19RC1, php-7.3.6RC1, php-7.1.29, php-7.2.18, php-7.3.5 |
|
#
36c79465 |
| 20-Apr-2019 |
Peter Kokot |
Move ucgendata README to generator file header |
Revision tags: php-7.2.18RC1, php-7.3.5RC1, php-7.2.17, php-7.3.4, php-7.1.28, php-7.3.4RC1, php-7.2.17RC1, php-7.1.27, php-7.3.3, php-7.2.16, php-7.3.3RC1, php-7.2.16RC1, php-7.2.15, php-7.3.2, php-7.2.15RC1, php-7.3.2RC1, php-5.6.40, php-7.1.26, php-7.3.1, php-7.2.14, php-7.2.14RC1, php-7.3.1RC1, php-5.6.39, php-7.1.25, php-7.2.13, php-7.0.33, php-7.3.0, php-7.1.25RC1, php-7.2.13RC1, php-7.3.0RC6, php-7.1.24, php-7.2.12, php-7.3.0RC5, php-7.1.24RC1, php-7.2.12RC1, php-7.3.0RC4 |
|
#
37c329d7 |
| 13-Oct-2018 |
Peter Kokot |
Trim trailing whitespace in source code files |
Revision tags: php-7.1.23, php-7.2.11, php-7.3.0RC3, php-7.1.23RC1, php-7.2.11RC1, php-7.3.0RC2, php-5.6.38, php-7.1.22, php-7.3.0RC1, php-7.2.10, php-7.0.32 |
|
#
02294f0c |
| 29-Aug-2018 |
Peter Kokot |
Make PHP development tools files and scripts executable

This patch makes several scripts and PHP development tools files executable and adds proper shebangs to the PHP scripts. The `#!/usr/bin/env php` shebang allows running the script via `./script.php` and uses env to locate the PHP binary on the system. At the same time, the script can still be run with a user-defined PHP binary using `php script.php`.
|
Revision tags: php-7.1.22RC1, php-7.3.0beta3, php-7.2.10RC1, php-7.1.21, php-7.2.9, php-7.3.0beta2, php-7.1.21RC1, php-7.3.0beta1, php-7.2.9RC1, php-5.6.37, php-7.1.20, php-7.3.0alpha4, php-7.0.31, php-7.2.8, php-7.1.20RC1, php-7.2.8RC1, php-7.3.0alpha3, php-7.3.0alpha2, php-7.1.19, php-7.2.7, php-7.1.19RC1, php-7.3.0alpha1, php-7.2.7RC1, php-7.1.18, php-7.2.6, php-7.2.6RC1, php-7.1.18RC1, php-5.6.36, php-7.2.5, php-7.1.17, php-7.0.30, php-7.1.17RC1, php-7.2.5RC1, php-5.6.35, php-7.0.29, php-7.2.4, php-7.1.16, php-7.1.16RC1, php-7.2.4RC1, php-7.1.15, php-5.6.34, php-7.2.3, php-7.0.28, php-7.2.3RC1, php-7.1.15RC1, php-7.1.14, php-7.2.2, php-7.1.14RC1, php-7.2.2RC1, php-7.1.13, php-5.6.33, php-7.2.1, php-7.0.27, php-7.2.1RC1, php-7.1.13RC1, php-7.0.27RC1, php-7.2.0, php-7.1.12, l, php-7.1.12RC1, php-7.2.0RC6, php-7.0.26RC1, php-7.1.11, php-5.6.32, php-7.2.0RC5, php-7.0.25, php-7.1.11RC1, php-7.2.0RC4, php-7.0.25RC1, php-7.1.10, php-7.2.0RC3, php-7.0.24, php-7.2.0RC2, php-7.1.10RC1, php-7.0.24RC1, php-7.1.9, php-7.2.0RC1, php-7.0.23, php-7.1.9RC1, php-7.2.0beta3, php-7.0.23RC1, php-7.1.8, php-7.2.0beta2, php-7.0.22 |
|
#
f4a1d9c8 |
| 28-Jul-2017 |
Nikita Popov |
Fixed bug #65544 and #71298 |
#
582a65b0 |
| 27-Jul-2017 |
Nikita Popov |
Implement full case mapping

Implement full case mapping according to SpecialCasing.txt and also full case folding according to CaseFolding.txt (F). There are a number of caveats:

* Only language-agnostic and unconditional full case mapping is implemented. The only language-agnostic conditional case mapping rule relates to Greek sigma in final position (Final_Sigma). Correctly handling this requires both arbitrary lookahead and lookbehind, which would require some larger changes to how the case mapping is implemented. This is a possible future extension.
* The only language-specific handling that is implemented is for Turkish dotted/undotted Is, if the ISO-8859-9 encoding is used. This matches the previous behavior and makes sure that no codepoints not supported by the encoding are produced. A future extension would be to also handle the Turkish mappings specified by SpecialCasing.txt based on the mbfl internal language.
* Full case folding is implemented, but case-insensitive mb_* operations continue to use simple case folding. The reason is that full case folding of the haystack string may change the position at which a match occurred. This would have to be mapped back into the position in the original string.
* mb_convert_case() exposes both the full and the simple case mapping / folding, where full is the default. The constants are:
  * MB_CASE_LOWER (used by mb_strtolower)
  * MB_CASE_UPPER (used by mb_strtoupper)
  * MB_CASE_TITLE
  * MB_CASE_FOLD
  * MB_CASE_LOWER_SIMPLE
  * MB_CASE_UPPER_SIMPLE
  * MB_CASE_TITLE_SIMPLE
  * MB_CASE_FOLD_SIMPLE (used by case-insensitive operations)
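The caveat about match positions can be reproduced with Python's `str.casefold()`, which also performs full case folding. The sketch below is only an analogy for the problem the commit describes, not mbstring code: `fold_with_offsets` and `find_caseless` are hypothetical helpers showing how a match found in the folded haystack could be mapped back to an index in the original string.

```python
def fold_with_offsets(s: str):
    """Fold s one character at a time, recording where each output came from."""
    folded_parts = []
    origin = []  # origin[i] = index in s of the character that produced folded[i]
    for index, ch in enumerate(s):
        folded = ch.casefold()  # may expand, e.g. "ß" becomes "ss"
        folded_parts.append(folded)
        origin.extend([index] * len(folded))
    origin.append(len(s))  # lets a match at the very end map back as well
    return "".join(folded_parts), origin


def find_caseless(haystack: str, needle: str) -> int:
    """Return the index in the ORIGINAL haystack of the first caseless match."""
    folded, origin = fold_with_offsets(haystack)
    pos = folded.find(needle.casefold())
    return -1 if pos < 0 else origin[pos]
```

For the haystack "Maße", full folding yields "masse", so a naive search for "e" in the folded text reports index 4, while the character sits at index 3 in the original; the `origin` list is what bridges that gap. Simple case folding maps each character to exactly one character, so it avoids the bookkeeping altogether.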
|
#
9ac7c1e7 |
| 27-Jul-2017 |
Nikita Popov |
Use case-folding for case insensitive comparisons

Instead of using lowercasing. |
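A small Python sketch can show why folding is preferred over lowercasing for caseless comparison. It relies on Python's built-in `str.lower()` and `str.casefold()`, not on mbstring, and the helper names are hypothetical. The examples use only one-to-one foldings, so simple and full case folding give the same result for them.

```python
def equal_by_lowercasing(a: str, b: str) -> bool:
    """Compare after lowercasing; distinct lowercase forms stay distinct."""
    return a.lower() == b.lower()


def equal_by_folding(a: str, b: str) -> bool:
    """Compare after case folding, which unifies variant lowercase letters."""
    return a.casefold() == b.casefold()


# Greek final sigma (U+03C2) and regular sigma (U+03C3) are both lowercase,
# so lowercasing leaves them different, while folding maps both to U+03C3.
FINAL_SIGMA_WORD = "σας"
PLAIN_SIGMA_WORD = "σασ"

# Latin long s (U+017F) is lowercase as well, but folds to a plain "s".
LONG_S = "\u017f"
```

Here `equal_by_lowercasing(FINAL_SIGMA_WORD, PLAIN_SIGMA_WORD)` is false while `equal_by_folding` on the same pair is true, because both sigma forms share the uppercase letter Σ and folding is designed to treat them as equivalent.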
#
80a0601f |
| 25-Jul-2017 |
Nikita Popov |
Use MPH for case maps

Instead of performing a binary search, use a hashtable to store the case maps. In particular a minimal perfect hash construction is used, which does not require collision resolution (but does use an auxiliary table for the hash perturbation).
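For readers unfamiliar with minimal perfect hashing, the following Python sketch shows one common "hash and displace" construction with an auxiliary seed table. It is an assumption-laden illustration, not the construction used by ucgendat.php: the mixer `_mix` (the MurmurHash3 32-bit finalizer), the function names `build_mph` and `mph_lookup`, and the seed search limit are all choices made for this example.

```python
def _mix(seed: int, key: int) -> int:
    """Scramble key + seed offset with the MurmurHash3 32-bit finalizer."""
    x = (key + seed * 0x9E3779B9) & 0xFFFFFFFF
    x ^= x >> 16
    x = (x * 0x85EBCA6B) & 0xFFFFFFFF
    x ^= x >> 13
    x = (x * 0xC2B2AE35) & 0xFFFFFFFF
    x ^= x >> 16
    return x


def build_mph(mapping: dict, max_seed: int = 1 << 16):
    """Return (seeds, slots) such that every key lands in its own slot."""
    n = len(mapping)
    buckets = [[] for _ in range(n)]
    for key in mapping:
        buckets[_mix(0, key) % n].append(key)

    seeds = [0] * n      # auxiliary table: one perturbation seed per bucket
    slots = [None] * n   # exactly one slot per key, hence "minimal"
    # Place the fullest buckets first, while most slots are still free.
    for b in sorted(range(n), key=lambda i: -len(buckets[i])):
        if not buckets[b]:
            break  # remaining buckets are empty
        for seed in range(1, max_seed):
            targets = [_mix(seed, key) % n for key in buckets[b]]
            if len(set(targets)) == len(targets) and all(
                slots[t] is None for t in targets
            ):
                for key, t in zip(buckets[b], targets):
                    slots[t] = (key, mapping[key])
                seeds[b] = seed
                break
        else:
            raise RuntimeError("no perturbation seed found; raise max_seed")
    return seeds, slots


def mph_lookup(seeds, slots, key: int, default=None):
    """Two hashes and one comparison; no probing or chaining."""
    n = len(slots)
    if n == 0:
        return default
    seed = seeds[_mix(0, key) % n]
    entry = slots[_mix(seed, key) % n]
    # A key outside the original set still hashes somewhere, so verify it.
    if entry is not None and entry[0] == key:
        return entry[1]
    return default
```

The build step is slow compared with a lookup, which suits a table that is generated once offline and then only read.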
|
#
eacd70f7 |
| 25-Jul-2017 |
Nikita Popov |
Don't store titlecase if same as uppercase

The totitle code already has a fallback for that case. |
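The fallback mentioned here can be pictured with two small lookup tables. This Python sketch is illustrative only: `UPPER`, `TITLE`, `to_upper`, and `to_title` are hypothetical names, and the tables contain just two hand-picked entries.

```python
# Hypothetical miniature tables keyed by codepoint.
UPPER = {
    0x0061: 0x0041,  # a -> A
    0x01C6: 0x01C4,  # U+01C6 (dz with caron, small) -> U+01C4 (all capitals)
}
# Only codepoints whose titlecase differs from their uppercase are stored.
TITLE = {
    0x01C6: 0x01C5,  # U+01C6 -> U+01C5 (capital D, small z with caron)
}


def to_upper(cp: int) -> int:
    """Map to uppercase, or return the codepoint unchanged if unmapped."""
    return UPPER.get(cp, cp)


def to_title(cp: int) -> int:
    """Use the titlecase table when present, else fall back to uppercase."""
    if cp in TITLE:
        return TITLE[cp]
    return to_upper(cp)
```

Because titlecase equals uppercase for the vast majority of characters, storing only the exceptions keeps the titlecase table tiny.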
#
cedfc2f4 |
| 25-Jul-2017 |
Nikita Popov |
Drop implementation-specific character properties

No point in keeping around non-standard character properties if we're not using them and most are not even being populated. |
#
8ace7045 |
| 25-Jul-2017 |
Nikita Popov |
Handle character ranges in ucgendat generically

In particular, the previous implementation did not account for Tangut Ideographs and CJK Ideograph extensions C through F. |
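UnicodeData.txt marks large blocks with a pair of records whose names end in `, First>` and `, Last>`. A generic handler keys off those suffixes instead of a fixed list of block names, so new blocks are picked up without code changes. The Python sketch below illustrates that idea under stated assumptions; it is not the ucgendat.php code, `expand_unicode_data` and `SAMPLE` are hypothetical names, and the range end points in the sample lines vary between Unicode versions.

```python
def expand_unicode_data(lines):
    """Yield (first, last, general_category) for each UnicodeData.txt record."""
    range_start = None
    for line in lines:
        fields = line.split(";")
        cp, name, category = int(fields[0], 16), fields[1], fields[2]
        if name.endswith(", First>"):
            range_start = cp  # remember the start, emit on the Last record
        elif name.endswith(", Last>"):
            if range_start is None:
                raise ValueError(f"Last record without First: {line!r}")
            yield (range_start, cp, category)
            range_start = None
        else:
            yield (cp, cp, category)  # ordinary single-codepoint record


# Sample records in the UnicodeData.txt format.
SAMPLE = [
    "0041;LATIN CAPITAL LETTER A;Lu;0;L;;;;;N;;;;0061;",
    "4E00;<CJK Ideograph, First>;Lo;0;L;;;;;N;;;;;",
    "9FFF;<CJK Ideograph, Last>;Lo;0;L;;;;;N;;;;;",
    "17000;<Tangut Ideograph, First>;Lo;0;L;;;;;N;;;;;",
    "187F7;<Tangut Ideograph, Last>;Lo;0;L;;;;;N;;;;;",
]
```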
#
0c0e35fe |
| 25-Jul-2017 |
Nikita Popov |
Port ucgendat to PHP

Implemented such that the output is identical, including some quirks that should be fixed subsequently. |