Revision tags: php-8.2.0RC1, php-8.1.10, php-8.0.23, php-8.0.23RC1, php-8.1.10RC1, php-8.2.0beta3, php-8.2.0beta2, php-8.1.9, php-8.0.22, php-8.1.9RC1, php-8.2.0beta1, php-8.0.22RC1, php-8.0.21, php-8.1.8, php-8.2.0alpha3, php-8.1.8RC1, php-8.2.0alpha2, php-8.0.21RC1, php-8.0.20, php-8.1.7, php-8.2.0alpha1, php-7.4.30, php-8.1.7RC1, php-8.0.20RC1, php-8.1.6, php-8.0.19, php-8.1.6RC1, php-8.0.19RC1, php-8.0.18, php-8.1.5, php-7.4.29, php-8.1.5RC1, php-8.0.18RC1, php-8.1.4, php-8.0.17, php-8.1.4RC1, php-8.0.17RC1, php-8.1.3, php-8.0.16, php-7.4.28, php-8.1.3RC1, php-8.0.16RC1, php-8.1.2, php-8.0.15 |
|
#
6cf30356 |
| 09-Jan-2022 |
Alex Dowad |
Implement fast text conversion interface for SJIS-mac
|
Revision tags: php-8.1.2RC1, php-8.0.15RC1, php-8.0.14, php-8.1.1, php-7.4.27, php-8.1.1RC1, php-8.0.14RC1, php-7.4.27RC1, php-8.1.0, php-8.0.13, php-7.4.26, php-7.3.33, php-8.1.0RC6, php-7.4.26RC1, php-8.0.13RC1, php-8.1.0RC5, php-7.3.32, php-7.4.25, php-8.0.12, php-8.1.0RC4, php-8.0.12RC1, php-7.4.25RC1, php-8.1.0RC3, php-8.0.11, php-7.4.24, php-7.3.31, php-8.1.0RC2, php-7.4.24RC1, php-8.0.11RC1, php-8.1.0RC1 |
|
#
d2f5a8b3 |
| 31-Aug-2021 |
Alex Dowad |
Add more tests for SJIS-mac text conversion
|
#
776296e1 |
| 30-Aug-2021 |
Alex Dowad |
mbstring no longer provides 'long' substitutions for erroneous input bytes

Previously, mbstring had a special mode whereby it would convert erroneous input byte sequences to output like "BAD+XXXX", where "XXXX" would be the erroneous bytes expressed in hexadecimal. This mode could be enabled by calling `mb_substitute_character("long")`.

However, accurately reproducing input byte sequences from the cached state of a conversion filter is often tricky, and this significantly complicates the implementation. Further, the means used for passing the erroneous bytes through to where the "BAD+XXXX" text is generated only allows for up to 3 bytes to be passed, meaning that some erroneous byte sequences are truncated anyway.

More to the point, a search of publicly available PHP code indicates that nobody is really using this feature anyway.

Incidentally, this feature also provided error output like "JIS+XXXX" if the input 'should have' represented a JISX 0208 codepoint, but it decodes to a codepoint which does not exist in the JISX 0208 charset. Similarly, specific error output was provided for non-existent JISX 0212 codepoints, and likewise for JISX 0213, CP932, and a few other charsets. All of that is now consigned to the flames.

However, "long" error markers also include a somewhat more useful "U+XXXX" marker for Unicode codepoints which were successfully decoded from the input text, but cannot be represented in the output encoding. Those are still supported.

With this change, there is no need to use a variety of special values in the high bits of a wchar to represent different types of error values. We can (and will) just use a single error value. This will be equal to -1.

One complicating factor: Text conversion functions return an integer to indicate whether the conversion operation should be immediately aborted, and the magic 'abort' marker is -1. Also, almost all of these functions would return the received byte/codepoint to indicate success. That doesn't work with the new error value; if an input filter detects an error and passes -1 to the output filter, and the output filter returns it back, that would be taken to mean 'abort'. Therefore, amend all these functions to return 0 for success.
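The return-value problem described in this commit message can be illustrated with a minimal sketch. It is not mbstring's actual code: `WCHAR_ERROR`, `sink_t`, `output_filter`, `echoing_output_filter`, and `convert` are hypothetical names, and the "decoder" is a toy that treats any byte of 0x80 or above as erroneous. It only assumes what the message states: the error value is -1, a negative return means 'abort', and success is now reported as 0.

```c
#include <stddef.h>

#define WCHAR_ERROR (-1) /* hypothetical name for the single error value */

/* Toy sink which only counts what it receives (hypothetical) */
typedef struct {
    int written;       /* valid codepoints received */
    int substitutions; /* error markers received */
} sink_t;

typedef int (*output_fn)(int c, sink_t *sink);

/* New convention: always return 0 for success, whatever was received */
static int output_filter(int c, sink_t *sink)
{
    if (c == WCHAR_ERROR)
        sink->substitutions++;
    else
        sink->written++;
    return 0;
}

/* Old convention: echo the received value back to signal success.
 * When c is -1, the caller cannot tell this apart from 'abort'. */
static int echoing_output_filter(int c, sink_t *sink)
{
    if (c == WCHAR_ERROR)
        sink->substitutions++;
    else
        sink->written++;
    return c;
}

/* Toy input stage: bytes below 0x80 pass through, anything else is
 * reported downstream as WCHAR_ERROR. Returns -1 if the output stage
 * asked for an abort, 0 otherwise. */
static int convert(const unsigned char *in, size_t len, output_fn out, sink_t *sink)
{
    for (size_t i = 0; i < len; i++) {
        int c = (in[i] < 0x80) ? in[i] : WCHAR_ERROR;
        if (out(c, sink) < 0)
            return -1; /* treated as 'abort' */
    }
    return 0;
}
```

Feeding the three bytes `'a', 0xFF, 'b'` through `convert` with `output_filter` processes all of them (two codepoints written, one substitution). With `echoing_output_filter`, the conversion stops after the bad byte, because the echoed -1 is read as an abort request.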
|
Revision tags: php-7.4.23, php-8.0.10, php-7.3.30, php-8.1.0beta3, php-8.0.10RC1, php-7.4.23RC1, php-8.1.0beta2, php-8.0.9, php-7.4.22 |
|
#
51b9d7a5 |
| 27-Jul-2021 |
Alex Dowad |
Test behavior of 'long' illegal character markers

After mb_substitute_character("long"), mbstring will respond to erroneous input by inserting 'long' error markers into the output. Depending on the situation, these error markers will either look like BAD+XXXX (for general bad input), U+XXXX (when the input is OK, but it converts to Unicode codepoints which cannot be represented in the output encoding), or an encoding-specific marker like JISX+XXXX or W932+XXXX.

We have almost no tests for this feature. Add a bunch of tests to ensure that all our legacy encoding handlers work in a reasonable way when 'long' error markers are enabled.
|
Revision tags: php-8.1.0beta1, php-7.4.22RC1, php-8.0.9RC1, php-8.1.0alpha3, php-7.4.21, php-7.3.29, php-8.0.8, php-8.1.0alpha2, php-7.4.21RC1, php-8.0.8RC1 |
|
#
39131219 |
| 11-Jun-2021 |
Nikita Popov |
Migrate more SKIPIF -> EXTENSIONS (#7139)

This is a mix of more automated and manual migration. It should remove all applicable extension_loaded() checks outside of skipif.inc files.
|
Revision tags: php-8.1.0alpha1, php-8.0.7, php-7.4.20, php-8.0.7RC1, php-7.4.20RC1, php-8.0.6, php-7.4.19, php-7.4.18, php-7.3.28, php-8.0.5, php-8.0.5RC1, php-7.4.18RC1, php-8.0.4RC1, php-7.4.17RC1, php-8.0.3, php-7.4.16, php-8.0.3RC1, php-7.4.16RC1, php-8.0.2, php-7.4.15, php-7.3.27, php-8.0.2RC1, php-7.4.15RC2, php-7.4.15RC1, php-8.0.1, php-7.4.14, php-7.3.26, php-7.4.14RC1, php-8.0.1RC1, php-7.3.26RC1, php-8.0.0, php-7.3.25, php-7.4.13, php-8.0.0RC5 |
|
#
c9fea7db |
| 14-Nov-2020 |
Alex Dowad |
Convert U+00AF (MACRON) to 0x8150 (FULLWIDTH MACRON) in some SJIS variants

Except for vanilla Shift-JIS, where 0x7E is a halfwidth overline/macron. As for Shift-JIS-2004, it has an added character (byte sequence 0x854A) which was defined as a halfwidth macron in JIS X 0213:2000, so we use that.
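The per-variant choice described in this commit message could be sketched as a small lookup. This is an illustration only, not the conversion-table change the commit actually made: the enum and function names are hypothetical, and `SJIS_OTHER_VARIANT` is a stand-in for whichever encodings the message means by "some SJIS variants".

```c
/* Hypothetical variant tags; not identifiers from mbstring */
enum sjis_variant {
    SJIS_VANILLA,       /* plain Shift-JIS */
    SJIS_2004,          /* Shift-JIS-2004 */
    SJIS_OTHER_VARIANT  /* stand-in for the other variants the commit covers */
};

/* Byte sequence (packed into an int) used for U+00AF (MACRON),
 * following the three cases named in the commit message. */
static int macron_u00af_to_sjis(enum sjis_variant v)
{
    switch (v) {
    case SJIS_VANILLA:
        return 0x7E;   /* halfwidth overline/macron in vanilla Shift-JIS */
    case SJIS_2004:
        return 0x854A; /* halfwidth macron added by JIS X 0213:2000 */
    default:
        return 0x8150; /* FULLWIDTH MACRON */
    }
}
```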
|
#
ecf71847 |
| 14-Nov-2020 |
Alex Dowad |
Convert U+FF5E (FULLWIDTH TILDE) to 0x8160 (WAVE DASH) in SJIS variants

By entering this character in the JIS X 0208 conversion table, we can remove a bunch of explicit `if` clauses in different conversion filters. It also means that U+FF5E can be converted into SJIS-mac now; I don't know why this one SJIS variant rejected U+FF5E before, since 0x8160 means the same thing in SJIS-mac as the others.
|
#
4f3bd2e2 |
| 14-Nov-2020 |
Alex Dowad |
Convert U+203E (OVERLINE) to 0x8150 (FULLWIDTH MACRON) in some SJIS variants

Converting U+203E to 0x7E was especially wrong for CP932, where 0x7E represents a tilde. For vanilla Shift-JIS and Shift-JIS-2004, converting to 0x7E is acceptable, since 0x7E does represent an overline/macron in those encodings. Follow the same principle in CP51932, which is closely related to CP932.
|
Revision tags: php-7.4.13RC1, php-8.0.0RC4, php-7.3.25RC1, php-7.4.12, php-8.0.0RC3, php-7.3.24, php-8.0.0RC2, php-7.4.12RC1, php-7.3.24RC1, php-7.2.34, php-8.0.0rc1, php-7.4.11, php-7.3.23, php-8.0.0beta4, php-7.4.11RC1, php-7.3.23RC1 |
|
#
1cf12c02 |
| 09-Sep-2020 |
Alex Dowad |
Add test suite for SJIS-mac encoding
|