encoding_tests.inc - OpenGrok history log for /PHP-8.2/ext/mbstring/tests/encoding

Revision	Date	Author	Comments
# 297fec09	01-Jul-2023	Niels Dossche <7771979+nielsdos@users.noreply.github.com>	Merge branch 'PHP-8.1' into PHP-8.2 * PHP-8.1: Fix GH-11300: license issue: restricted unicode license headers
# ee42621f	01-Jul-2023	Niels Dossche <7771979+nielsdos@users.noreply.github.com>	Fix GH-11300: license issue: restricted unicode license headers Closes GH-11572. /PHP-8.2/ext/mbstring/tests/encoding_tests.inc
# 3c732251	21-Jul-2021	Alex Dowad	New internal interface for fast text conversion in mbstring When converting text to/from wchars, mbstring makes one function call for each and every byte or wchar to be converted. Typically, each of these conversion functions contains a state machine, and its state has to be restored and then saved for every single one of these calls. It doesn't take much to see that this is grossly inefficient. Instead of converting one byte or wchar on each call, the new conversion functions will either fill up or drain a whole buffer of wchars on each call. In benchmarks, this is about 3-10× faster. Adding the new, faster conversion functions for all supported legacy text encodings still needs some work. Also, all the code which uses the old-style conversion functions needs to be converted to use the new ones. After that, the old code can be dropped. (The mailparse extension will also have to be fixed up so it will still compile.) /PHP-8.2/ext/mbstring/tests/encoding_tests.inc
# ff85ed8a	19-Jun-2021	Alex Dowad	Fix conversion of EUC-TW text (and add test suite) - Treat text which ends abruptly in the middle of a multi-byte character as erroneous. - Don't allow ASCII control characters to appear in the middle of a multi-byte character. - If an illegal byte appears in the middle of a multi-byte character, go back to the initial state rather than trying to finish the multi-byte character. - There was a bug in the file with the conversion tables, which set the 'maximum codepoint which can be converted using table A2' using the size of table A1, not table A2. This meant that several hundred Unicode codepoints which should have been able to be converted to EUC-TW were flagged as erroneous instead. - When a sequence which cannot possibly be a prefix of a valid multi-byte character is found, immediately flag it as an error, rather than waiting to read more bytes first. - Allow characters in CNS-11643 plane 1 to be encoded as 4-byte sequences (although they can also be encoded as 2-byte sequences). This is allowed by the standard for EUC-TW text. /PHP-8.2/ext/mbstring/tests/encoding_tests.inc
# b489c1bc	16-Nov-2020	Alex Dowad	Bugfixes for findInvalidChars (helper for mbstring test suite) /PHP-8.2/ext/mbstring/tests/encoding_tests.inc
# d1d50c2b	09-Nov-2020	Alex Dowad	Test EUC-JP and Shift-JIS more thoroughly Previously, the unit tests for these text encodings covered all mappings from legacy -> Unicode, and all _reversible_ mappings from Unicode -> legacy. However, we should also test the few Unicode -> legacy mappings which are not reversible. /PHP-8.2/ext/mbstring/tests/encoding_tests.inc
# 3eb8828d	05-Nov-2020	Alex Dowad	Fix issues with mbstring encoding tests I made some mistakes on this code, which meant that not everything which should be tested was actually being tested. /PHP-8.2/ext/mbstring/tests/encoding_tests.inc
# b18b9c9e	02-Nov-2020	Alex Dowad	Test cases for mbstring encodings are less repetitive /PHP-8.2/ext/mbstring/tests/encoding_tests.inc
# 831abe2d	18-Oct-2020	Alex Dowad	Add test suite for CP1252 encoding Also remove a bogus test (bug62545.phpt) which wrongly assumed that all invalid characters in CP1251 and CP1252 should map to Unicode 0xFFFD (REPLACEMENT CHARACTER). mbstring has an interface to specify what invalid characters should be replaced with; it's called `mb_substitute_character`. If a user wants to see the Unicode 'replacement character', they can specify that using `mb_substitute_character`. But if they specify something else, we should follow that. /PHP-8.2/ext/mbstring/tests/encoding_tests.inc
# 84c180d8	19-Sep-2020	Alex Dowad	Add test suite for ISO-8859-x encoding verification and conversion /PHP-8.2/ext/mbstring/tests/encoding_tests.inc