History log of /php-src/ext/mbstring/tests/data/EUC-TW.txt (Results 1 – 1 of 1)
Revision Date Author Comments
Revision tags: php-8.2.0RC1, php-8.1.10, php-8.0.23, php-8.0.23RC1, php-8.1.10RC1, php-8.2.0beta3, php-8.2.0beta2, php-8.1.9, php-8.0.22, php-8.1.9RC1, php-8.2.0beta1, php-8.0.22RC1, php-8.0.21, php-8.1.8, php-8.2.0alpha3, php-8.1.8RC1, php-8.2.0alpha2, php-8.0.21RC1, php-8.0.20, php-8.1.7, php-8.2.0alpha1, php-7.4.30, php-8.1.7RC1, php-8.0.20RC1, php-8.1.6, php-8.0.19, php-8.1.6RC1, php-8.0.19RC1, php-8.0.18, php-8.1.5, php-7.4.29, php-8.1.5RC1, php-8.0.18RC1, php-8.1.4, php-8.0.17, php-8.1.4RC1, php-8.0.17RC1, php-8.1.3, php-8.0.16, php-7.4.28, php-8.1.3RC1, php-8.0.16RC1, php-8.1.2, php-8.0.15, php-8.1.2RC1, php-8.0.15RC1, php-8.0.14, php-8.1.1, php-7.4.27, php-8.1.1RC1, php-8.0.14RC1, php-7.4.27RC1, php-8.1.0, php-8.0.13, php-7.4.26, php-7.3.33, php-8.1.0RC6, php-7.4.26RC1, php-8.0.13RC1, php-8.1.0RC5, php-7.3.32, php-7.4.25, php-8.0.12, php-8.1.0RC4, php-8.0.12RC1, php-7.4.25RC1, php-8.1.0RC3, php-8.0.11, php-7.4.24, php-7.3.31, php-8.1.0RC2, php-7.4.24RC1, php-8.0.11RC1, php-8.1.0RC1, php-7.4.23, php-8.0.10, php-7.3.30, php-8.1.0beta3, php-8.0.10RC1, php-7.4.23RC1, php-8.1.0beta2, php-8.0.9, php-7.4.22, php-8.1.0beta1, php-7.4.22RC1, php-8.0.9RC1, php-8.1.0alpha3, php-7.4.21, php-7.3.29, php-8.0.8, php-8.1.0alpha2
# ff85ed8a 19-Jun-2021 Alex Dowad

Fix conversion of EUC-TW text (and add test suite)

- Treat text which ends abruptly in the middle of a multi-byte
character as erroneous.
- Don't allow ASCII control characters to appear in the middle of a
multi-byte character.
- If an illegal byte appears in the middle of a multi-byte character,
go back to the initial state rather than trying to finish the
multi-byte character.
- There was a bug in the file with the conversion tables, which set the
'maximum codepoint which can be converted using table A2' using the
size of table A1, not table A2. This meant that several hundred
Unicode codepoints which should have been convertible to
EUC-TW were flagged as erroneous instead.
- When a sequence which cannot possibly be a prefix of a valid
multi-byte character is found, immediately flag it as an error, rather
than waiting to read more bytes first.
- Allow characters in CNS-11643 plane 1 to be encoded as 4-byte
sequences (although they can also be encoded as 2-byte sequences).
This is allowed by the standard for EUC-TW text.
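
The decoding rules listed above can be sketched as a small state machine.
The following is a minimal illustration in C, not the actual mbstring
code: the function name euctw_count_errors, the state names and the
helper predicates are all hypothetical. It assumes the usual EUC-TW
layout (ASCII below 0x80; 2-byte sequences with both bytes in 0xA1-0xFE;
4-byte sequences of 0x8E, a plane byte in 0xA1-0xB0, then two bytes in
0xA1-0xFE), and it assumes that an illegal byte is consumed together
with the error it raises. It only counts errors and performs no table
lookup, so it does not cover the table A2 size fix.

```c
#include <stddef.h>

/* Decoder states for this sketch (hypothetical names). */
enum euctw_state {
    ST_INITIAL,     /* expecting the first byte of a character */
    ST_2BYTE_TRAIL, /* saw a 0xA1-0xFE lead byte, expecting the trail byte */
    ST_4BYTE_PLANE, /* saw 0x8E, expecting the plane byte */
    ST_4BYTE_ROW,   /* saw the plane byte, expecting the third byte */
    ST_4BYTE_CELL   /* saw the third byte, expecting the fourth byte */
};

static int is_row_or_cell(unsigned char c) { return c >= 0xA1 && c <= 0xFE; }

/* Assumption: plane bytes 0xA1-0xB0 select CNS-11643 planes 1-16. */
static int is_plane(unsigned char c) { return c >= 0xA1 && c <= 0xB0; }

/* Returns the number of errors found in s[0..len). */
size_t euctw_count_errors(const unsigned char *s, size_t len)
{
    enum euctw_state st = ST_INITIAL;
    size_t errors = 0;

    for (size_t i = 0; i < len; i++) {
        unsigned char c = s[i];

        switch (st) {
        case ST_INITIAL:
            if (c < 0x80) {
                /* ASCII: a complete character on its own */
            } else if (c == 0x8E) {
                st = ST_4BYTE_PLANE;
            } else if (is_row_or_cell(c)) {
                st = ST_2BYTE_TRAIL;
            } else {
                errors++; /* cannot start any character: flag at once */
            }
            break;

        case ST_4BYTE_PLANE:
            if (is_plane(c)) {
                st = ST_4BYTE_ROW; /* plane 1 (0xA1) is accepted here too */
            } else {
                errors++;          /* not a valid prefix: flag at once */
                st = ST_INITIAL;
            }
            break;

        case ST_4BYTE_ROW:
            if (is_row_or_cell(c)) {
                st = ST_4BYTE_CELL;
            } else {
                errors++;          /* includes ASCII control characters */
                st = ST_INITIAL;
            }
            break;

        case ST_2BYTE_TRAIL:
        case ST_4BYTE_CELL:
            if (!is_row_or_cell(c)) {
                errors++;          /* illegal final byte */
            }
            st = ST_INITIAL;       /* character finished or abandoned */
            break;
        }
    }

    if (st != ST_INITIAL) {
        errors++; /* input ended in the middle of a multi-byte character */
    }
    return errors;
}
```

In this sketch an illegal byte never leaves the decoder waiting for the
rest of the abandoned character: every error path returns to ST_INITIAL.
The 4-byte form of a plane 1 character (for example 8E A1 A4 A1) passes
without error, as does its 2-byte form (A4 A1).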