#
04e59c91 |
| 15-Apr-2022 |
Alex Dowad |
Error handling for UTF-8 complies with WHATWG specification

In 7502c86342, I adjusted the number of error markers emitted on invalid UTF-8 text to be more consistent with mbstring's behavior on other text encodings (generally, it emits one error marker for one unexpected byte). I didn't expect that anybody would actually care one way or the other, but felt that it was better to be consistent than not.

Later, Martin Auswöger kindly pointed out that the WHATWG encoding specification, which governs how various text encodings are handled by web browsers, does actually specify how many error markers should be generated for any given piece of invalid UTF-8 text.

Until now, we have never really paid much attention to the WHATWG specification, but we do want to comply with as many relevant specifications as possible. And since PHP is commonly used for web applications, compatibility with the behavior of web browsers is obviously a good thing.
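
To make the "how many error markers" question concrete, here is a small illustrative sketch. It is not mbstring code and not taken from this commit. It uses Python's built-in UTF-8 decoder with `errors="replace"`, on the assumption that its replacement behavior (one U+FFFD per maximal invalid subsequence) matches what the WHATWG decoder would produce for these inputs. The helper name `count_error_markers` and the sample byte strings are hypothetical.

```python
# Illustrative only: count the U+FFFD replacement characters produced when
# decoding invalid UTF-8. Assumes CPython's "replace" handler behaves like
# the WHATWG UTF-8 decoder for these samples; this is not mbstring itself.

REPLACEMENT = "\ufffd"


def count_error_markers(data: bytes) -> int:
    """Return how many error markers decoding `data` as UTF-8 yields."""
    return data.decode("utf-8", errors="replace").count(REPLACEMENT)


# Hypothetical sample inputs, chosen to show different kinds of invalid text.
samples = [
    ("valid text", "héllo".encode("utf-8")),
    ("4-byte sequence truncated at end of input", b"\xf0\x9f\x98"),
    ("4-byte sequence interrupted by ASCII", b"\xf0\x9f\x98A"),
    ("lead byte followed by a non-continuation byte", b"\xe2\x28\xa1"),
    ("overlong encoding of NUL", b"\xc0\x80"),
    ("encoded surrogate", b"\xed\xa0\x80"),
]

for label, raw in samples:
    print(f"{label}: {raw!r} -> {count_error_markers(raw)} marker(s)")
```

The point of the sketch is that the count is not simply "one marker per invalid byte": a truncated multi-byte sequence collapses into a single marker, while bytes that can never start or continue a valid sequence each get their own.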
|